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On  page  13  it  would  be  desirable  to  replace  the  line  following  equation  (5.3) 
by  the  following! 
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follows  from  properties  (l)-(4)  that  ty  depends  only  on  j/ 
and  is  independent  of  N.) 
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ABSTRACT 

The  use  of  a  symmetrical  moving  weighted  average  of  2m  +  1  terms  to 
smooth  equally  spaced  observations  of  a  function  of  one  variable  does  not 
yield  smoothed  values  of  the  first  m  and  the  last  m  observations,  unless 
additional  data  beyond  the  range  of  the  original  observations  are  available. 
By  means  of  analogies  to  the  Whittaker  smoothing  process  and  some  related 
mathematical  concepts,  a  natural  method  is  developed  for  extending  the 
smoothing  to  the  extremities  of  the  data  as  a  single  overall  matrix-vector 
operation  having  a  well  defined  structure,  rather  than  as  something  extra 
grafted  on  at  the  ends. 
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SIGNIFICANCE  AND  EXPLANATION 


The  use  of  a  moving  weighted  average  of  2m  +  I  terms  to  smooth 
equally  spaced  observations  of  a  function  of  one  variable  does  not 
yield  smoothed  values  of  the  first  m  and  the  last  m  observations, 
unless  additional  data  beyond  the  range  of  tha  original  observations 
are  available.  Using  Toeplitz  matrices,  Laurent  series,  and  analogies 
to  the  Whittaker  smoothing  process,  we  develop  a  natural  method  of 
extending  the  smoothing  to  the  extremities  of  the  data. 


summary  lies  with  MRC,  and  not  with  tha  author  of  this  report. 


MOVING-WEIGHTED~ AVERAGE  SMOOTHING 
EXTENDED  TO  THE  EXTREMITIES  OF  THE  DATA 

T.  N.  E.  Greville 


1.  INTRODUCTION 

A  time-honored  method  o£  smoothing  equally  spaced  observations  of  a 
function  of  one  variable  to  remove  or  reduce  unwanted  irregularities  is  the 
moving  weighted  average  (MMA) .  An  example  is  Spencer's  15-term  average 
(Macaulay  1931;  Henderson  1938) ,  which  can  be  expressed  in  the  form 

“«  '  sio(-3y«-7  -  6V«  -  ^x-5  *  3V«  ♦  !lVj  ♦  46V2  4  67Vl 

♦  Hrx  ♦  67yxtl  ♦  «yxt2  ♦  Jly^  ♦  Jy^  -  5,^  -  6,^  -  3»x<7> . 


(1.1) 


where  yx  is  the  observed  value  corresponding  to  the  argument  x,  and  ux 
is  the  corresponding  adjusted  value.  Actuarial  writers  commonly  refer  to 
such  smoothing  as  "graduation." 

More  generally  (Schoenberg  1946)  a  symmetrical  MWA  is  of  the  form 


“*  ■  j-L 


where  m  is  a  given  positive  integer  and  the  real  coefficients  c^  are  such 
that  c_j  »  c^  and 


l  c  -1. 

j-m  3 


Such  averages  have  a  long  history  that  is  not  widely  known.  One  of  the  ear¬ 
liest  writers  on  the  subject  was  the  Italian  astronomer  G.  V.  Schiaparelli 
(1866),  who  is  remembered  chiefly  for  his  observations  of  the  planet  Mars. 
Futher  contributions  were  made  by  the  Danish  actuary  and  mathematician 
J.  P.  Gram  and  the  Danish  astronomer  T.  N.  Thiele,  both  of  whom  played  major 
roles  in  the  early  development  of  statistical  theory.  The  majority  of  publi¬ 
cations  on  this  subject  have  appeared  in  English  and  Scottish  actuarial  jour¬ 
nals  starting  with  John  Finlaison  in  1829  (see  Maclean  1913) .  probably  the 
first  writer  to  make  a  systematic  investigation  of  such  averages  was  the 
Mmrican  mathematician  E.  L.  Da  Forest  (1873,  1875,  1876,  1877).  His  work, 
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published  in  obscure  places,  was  rescued  from  total  oblivion  largely  through 
the  efforts  of  Hugh  H.  Wolfenden  (1892-1968),  who  also  made  important  contri¬ 
butions  to  the  subject  (Wolfenden  1925).  E.  T.  Whittaker  (1923)  suggested  an 
alternative  method  of  smoothing,  which  has  been  widely  employed ,  especially 
by  actuaries,  and  will  be  referred  to  extensively  later,  because  of  numerous 
analogies  to  the  MWA  procedure.  The  first  writer  to  apply  sophisticated 
mathematical  tools  to  the  study  of  these  averages  was  I.  J.  Schoenberg  (1946, 
1948,  1953),  who  introduced  the  notion  of  the  characteristic  function  of  an 
MWA,  and  utilized  it  to  formulate  a  criterion  for  judging  whether  a  given 
average  can  properly  be  called  a  "smoothing  formula."  This  criterion  will  be 
discussed  in  Section  11. 

2.  THE  PROBLEM  OF  SMOOTHING  NEAR  THE  EXTREMITIES  OF  THE  DATA 

When  MWA's  have  been  used  by  actuaries,  the  argument  x  is  usually  age 
(of  a  person)  in  completed  years.  When  they  are  used  for  smoothing  economic 
time  series,  x  denotes  the  position  of  a  particular  observation  in  a  time 
sequence.  The  latter  area  of  application  appears  to  stem  largely  from  the 
work  of  Frederick  R.  Macaulay  (1931),  who  was  the  son  of  an  actuary. 

In  either  case,  a  serious  disadvantage  of  the  method  is  that  it  does  not 
produce  adjusted  values  for  arguments  too  near  the  extremities  of  the  data. 

For  example,  suppose  Spencer's  15-term  average  is  used  to  smooth  monthly  data 
extending  from  1970  through  1976.  The  formula  does  not  give  smoothed  values 
for  the  first  7  months  of  1970  or  the  last  7  months  of  1976  unless  data  can 
be  obtained  for  the  last  7  months  of  1969  and  the  first  7  months  of  1977. 
Clearly,  acquisition  of  data  extending  farther  into  the  past  is  less  of  a  prob¬ 
lem  than  acquisition  of  future  data. 

Actuaries  in  North  America  seem  to  have  largely  abandoned  the  use  of 
MWA's  in  favor  of  Whittaker's  method,  which  does  not  have  the  disadvantage 
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described.  It  is  likely  that  British  actuaries  may  still  use  these  averages 
to  some  extent.  They  appear  to  be  currently  employed  by  economic  and  demo-* 
graphic  statisticians  (Shiskin  and  Eisenpress  1957;  Shiskin,  Young,  and 

*  Musgrave  1967) . 

Various  suggestions  have  been  made  (De  Forest  1877;  Miller  1946;  Greville 
1957,  1974a)  for  dealing  with  the  problem  of  adjustment  of  data  near  the  ex¬ 
tremities,  but  none  of  them  has  won  general  acceptance.  De  Forest's  (1877, 
p.  110)  suggestion  is  so  relevant  to  the  subject  of  the  present  paper  that  it 
is  worth  quoting  in  full: 

"As  the  first  m  and  the  last  m  terns  of  the  series  cannot  be  reached 
directly  by  the  formula  [of  2m  +  1  terms] ,  the  series  should  be  graphically 
extended  by  m  terms  at  both  ends,  first  plotting  the  observations  on  paper 
as  ordinates,  and  then  extending  the  curve  along  what  seems  to  be  its  probable 
course,  and  measuring  the  ordinates  of  the  expended  portions.  It  is  not  neces- 

*  sary  that  this  extension  should  coincide  with  what  would  be  the  true  course 
of  the  curve  in  those  parts.  The  important  part  is  that  the  m  terms  thus 

I 

added,  taken  together  with  the  m  +  1  adjacent  given  terms,  should  follow  a 
curve  whose  form  is  approximately  algebraic  and  of  a  degree  not  higher  than 
the  third." 

Elsewhere  (Greville  1974a)  I  have  proposed  extrapolating  the  observed 
data  by  fitting  a  least-squares  cubic  to  the  first  m  +  1  values  and  a  similar 
cubic  to  the  last  m  +  1  observations.  Though  my  proposal  was  made  before  I 
had  noted  the  passage  just  quoted  from  De  Forest,  it  is  very  much  in  the  spirit 
of  his  suggestion;  it  is  not  a  long  step  from  graphic  to  algebraic  extra- 

» 

polation. 

*  Another  approach  (Greville  1957)  regards  the  adjustment  process  as  a 
matrix-vector  operation.  We  write 
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where  y  is  the  vector  of  observed  values,  u  is  the  corresponding  vector 
of  adjusted  values,  and  G  is  a  square  matrix.  If  a  specified  symmetrical 
MWA  of  2m  +  1  terms  is  to  be  used  wherever  possible,  then  the  nonzero 
elements  of  G  ,  except  for  the  first  m  and  the  last  m  rows,  are  merely 
the  weights  in  the  moving  average,  these  weights  moving  to  the  right  as  one 
proceeds  down  the  rows  of  the  matrix.  In  the  first  m  and  the  last  m  rows 
special  unsymmetrical  weights,  determined  in  same  appropriate  manner,  must  be 
inserted.  The  matrix  approach  and  the  extrapolation  approach  are  not  wholly 
unrelated,  since  the  final  results  of  the  extrapolation  approach  can  be  ex¬ 
pressed  in  matrix  form. 

It  is  the  purpose  of  the  present  paper  to  show  that  when  a  given  sym¬ 
metrical  MWA  is  being  employed  and  fulfills  certain  minimal  requirements,  there 
is  a  natural,  preferred  method  of  extending  the  adjustment  to  the  extremities 
of  the  data,  strongly  suggested  by  the  mathematical  properties  of  the  weighted 
average.  This  natural  method  of  extension  seems  to  have  eluded  previous 
writers  on  the  subject,  as  indeed  it  eluded  me  during  the  many  years  I  have 
thought  about  the  matter.  The  preferred  method  of  extension  has  the  inter¬ 
esting  property  that  it  can  be  arrived  at  either  through  the  matrix  approach 
or  the  extrapolation  approach,  in  the  latter  case,  one  must  employ  a  special 
extrapolation  formula  uniquely  determined  by  the  given  MWA.  Though  the  two 
approaches  appear  to  be  quite  different,  they  will  be  shown  in  Section  8  to  be 
mathematically  equivalent,  and  they  will  give  identical  results  except  for 
rounding  error.  In  the  matrix  approach  the  treatment  of  the  values  near  the 
ends  becomes  an  integral  part  of  a  single  overall  operation,  and  not  something 
extra  grafted  on  at  the  ends.  It  is  especially  fitting  that  it  should  be 
published  now,  since  the  centennial  of  De  Forest's  death  occurs  in  the  1980's. 


In  my  own  thinking  I  arrived  at  the  procedure  first  through  the  matrix 
approach,  guided  largely  by  extensive  analogies  to  the  Whittaker  process 
(which  is  most  conveniently  expressed  in  matrix  terms).  It  was  only  later 
that  I  became  aware  that  identical  results  could  be  obtained  by  means  of  an 
extrapolation  algorithm.  Though  the  matrix  approach  provides  greater  insight 
into  the  rationale  behind  the  procedure,  the  extrapolation  approach  is  simpler 
computationally.  Therefore,  we  shall  first  describe  and  illustrate  the  extra¬ 
polation  algorithm,  and  shall  then  motivate  and  justify  the  procedure  by  means 
of  the  matrix  approach.  This  investigation  has  led  to  some  interesting  mathe¬ 
matical  developments  (Greville  and  Trench  1979;  Greville  1979;  Greville  1980) 
that  have  been  published  elsewhere  in  a  more  general  context  and  will  be  cited 
here  as  the  need  arises. 

The  extrapolation  approach  is  merely  a  computational  short  cut,  and  nearly 
always  the  extended  values  obtained  by  its  use  are  highly  unrealistic  if  re¬ 
garded  as  extrapolated  values  of  the  function  under  observation .  (The  reader 
will  note  that  the  quotation  from  De  Forest  earlier  in  this  section  contains 
a  similar  admonition.)  This  fact  is  irrelevant,  but  has  seriously  "turned 
off"  some  users.  Hereafter  I  shall  therefore  avoid  the  use  of  the  words 
"extrapolate"  and  "extrapolation,"  and  shall  speak  of  "extension,"  "extended 
values,"  and  "intermediate  values." 

It  is  emphasized  that  the  procedure  to  be  described  (or  any  other  proce¬ 
dure  for  completing  the  graduation)  is  recommended  for  use  only  when  addi¬ 
tional  data  extending  beyond  the  range  of  the  original  observations  are  not 
available. 

Probably  some  readers  will  be  primarily  concerned  with  the  application 
of  the  method  to  mmwrical  data,  and  will  have  less  interest  in  its  mathe¬ 
matical  development.  Such  readers  will  find  the  information  they  require  in 


the  following  Sections  3  and  4.  on  the  other  hand,  readers  who  may  wis. 
pursue  the  mathematical  derivation  first  and  leave  computational  details 
later  may  skip  Sections  3  and  4  and  pass  at  once  to  Section  5. 

3.  THE  EXTENSION  ALGORITHM 

A  weighted  average  of  the  form  (1.2)  will  be  called  exact  for  the  degree 
r  if  it  has  the  property  that,  in  case  all  the  observed  values  y  in  (1." 

X-] 

should  happen  to  be  the  corresponding  ordinates  of  same  polynomial  P(x  -  j) 
of  degree  r  or  less,  then 

“x  “  »,  *  p(x)  ■ 

but  there  is  same  polynomial  of  degree  r  +  1  for  which  this  is  not  the  case. 
In  other  words,  an  average  that  is  exact  for  the  degree  r  reproduces  without 
change  polynomials  of  degree  r  or  less,  but  not  in  general  those  of  higher 
degree.  If  the  weights  are  symmetrical,  r  must  be  odd,  and  we  may  write 
r  ■  2s  -  1.  This  implies  that  r  <  2m  +  1  ,  and  therefore  s  _<  m  . 

For  a  simple  (unweighted)  average,  r  =*  1.  For  the  overwhelming  majority 
of  MWA's  used  in  practice,  r  *  3.  The  preference  for  cubics  has  a  long  his¬ 
tory.  De  Forest  (1873,  p.  281)  suggests  that  "a  curve  of  the  third  degree, 
which  admits  a  point  of  inflexion  ...  is  ...  better  adapted  than  the  common 
parabola  to  represent  the  form  of  a  series  whose  second  difference  changes  its 
sign." 

We  shall  use  the  notation  of  the  calculus  of  finite  differences,  wherein 
E  is  the  "displacement  operator"  or  "shift  operator"  defined  by 

Ef(x)  -  f (x  +  1)  , 

and  6  is  the  "central  difference"  operator  defined  by 

6f(x)  -  f(x  +  1/2)  -  f(x  -  1/2),  (3.1) 


so  that 


62f(x)  -  f(x  +  1)  -  2f (x)  +  f(x  -  1) 


If  the  weighted  average  (1.2)  is  exact  for  the  degree  2s  -  1,  it  can  be 
written  in  the  form 

ux  =  (1  -  (-1)®  628  q(E)]yx  ,  (3.2) 

where  q(E)  is  of  the  form 

m-s  . 

q(E)  =  l  q.  E3  (3.3) 

j=-m+s  3 

with  q  =  q^.  In  a  typical  smoothing  formula  q(E)  has  only  positive  co¬ 
efficients,  but  this  is  not  necessarily  the  case.  If  q(z)  is  multiplied  by 
zm  s  to  eliminate  negative  exponents,  the  resulting  polynomial  is  of  degree 
2m  -  2s.  Because  of  the  symmetry  of  the  coefficients,  it  is  a  reciprocal 
polynomial.  In  other  words,  if  p  is  a  zero  of  the  polynomial,  it  follows 
that  p  1  is  a  zero.  In  general,  we  shall  make  the  assumption  that  this 
polynomial  has  no  zeros  on  the  unit  circle.  The  case  in  which  it  does  have 
such  zeros  is  mainly  of  theoretical  interest  and  is  briefly  referred  to  in 
Section  7. 

Let  p (z)  denote  the  polynomial  of  degree  m-s  with  leading  coef¬ 
ficient  unity  whose  zeros  are  the  m-s  zeros  of  zm  Sq(z)  located  within 
the  unit  circle.  In  general,  some  or  all  of  these  zeros  are  complex,  but 
they  must  occur  in  conjugate  pairs,  so  that  p(z)  has  real  coefficients.  Now 

we  define  a  polynomial  a(z)  of  degree  m  and  its  coefficients  a.  by 

m  .  3 

a(z)  =*  (z  -  1)S  p  (z)  *  zm  -  £  a.  zm  3  .  (3.4) 

j=l  3 

Suppose  the  given  data  consist  of  N  =  Q  -  P  +  1  given  values  extending 
from  x  ■  P  to  x  *  Q.  we  assume  that  N  >_  2m  +  1,  so  that  at  least  one 
smoothed  value  is  obtained  by  direct  application  of  the  given  MWA.  Then  we 
obtain  m  intermediate  values  to  the  left  of  x  ■  P  by  successive  applica¬ 
tion  of  the  recurrence 

yx  "  ^  aj  yx+j* 

Similarly,  m  intermediate  values  to  the  right  of  x  -  Q  will  be  obtained  by 


the  analogous  recurrence 


y  ■  T  a.  y 
x  ^  3  Jx-j 

Finally,  application  of  the  syranetrical  MWA  of  2m  +  1 
observed  and  intermediate  values  gives  adjusted  values 
P  +  1, « . • ,Q. 


terms  to  the  N  +  2m 

u  for  x  =  P  , 
x 


For  example,  Spencer's  15-term  formula  (1.1)  can  be  expressed  in  the  form 
(3.2)  with  s  =  2,  where 

q(E)  «  (3E~5  +  18E-4  +  59E“3  +  137E-2  +  242E-1  +  318  +  242E  +  137E2 

+  59E3  +  18E4  +  3E5). 

Using  a  computer  program  to  find  the  zeros  of  z3  q(z),  constructing  the  poly¬ 
nomial  p(z),  and  finally  applying  the  formula  (3.4),  we  obtain  for  Spencer's 
15-term  formula 

a (z)  -  z7  -  .961572Z6  -  .372752Z5  -  .015904z4  +  .123488z3  +  .125229z2 

+  ,075887z  +  .025624. 


The  coefficients  are  rounded  to  the  nearest  sixth  decimal  place,  except  that 

3  2 

the  final  digits  of  the  coefficients  of  z  and  z  have  been  adjusted  by 
one  unit  to  make  the  sum  of  the  coefficients  exactly  zero. 

Note  that  in  the  trivial  case  s  ■  m  ,  q(z)  is  a  constant  and  p(z)  is 
unity.  Thus  the  algorithm  reduces  to  extrapolation  of  the  observed  data  by 
sth  differences  (i.e.,  by  fitting  a  polynomial  of  degree  s  -  1  to  the  first 
s  observations  and  a  similar  polynomial  to  the  last  s  observations) . 

As  a  numerical  illustration,  Spencer's  15-term  average  has  been  applied 
to  seme  meteorological  data.  Table  1  and  Figure  A  show  the  observed  and 
graduated  values  of  monthly  precipitation  in  Madison,  Wisconsin  in  the  years 
1967-71.  No  adjustment  has  been  made  for  the  unequal  length  of  the  months. 


4.  TABLES  OF  MOVING-AVERAGE  AND  EXTENSION  COEFFICIENTS 


Tables  2  and  3  show  the  coefficients  in  the  MWA  and  the  corresponding 
extension  coefficients  (that  is,  c^  and  a^)  for  21  weighted  averages  that 
have  appeared  in  the  literature.  Table  2  is  devoted  to  the  class  of  averages 
known  to  actuaries  as  minim um-F  formulas  and  to  economic  statisticians  as 
"Henderson's  ideal"  formulas.  They  are  discussed  more  fully  in  Section  8. 

The  values  in  Table  2  are  shown  to  six  decimal  places .  In  both  instances ,  a 
few  final  digits  have  been  adjusted  by  one  unit  to  make  the  sum  exactly  unity. 
The  moving-average  coefficients  are  given  to  the  nearest  sixth  decimal  place 
except  for  the  slight  adjustments  mentioned;  rounding  error  in  the  computation 
of  the  extension  coefficients  may  have  introduced  further  small  errors  in  some 
instances. 

Table  3  is  concerned  with  11  moving  averages  derived  by  various  writers 
on  am  ad  hoc  basis  and  known  by  the  names  of  their  originators.  The  source 
notes  for  this  table  do  not  attempt  to  cite  the  earliest  publication  of  the 
formula  in  question,  but  merely  indicate  a  convenient  reference  where  it  can 
be  found.  All  these  averages  are  exact  for  cubics  except  Hardy's,  which  is 
exact  only  for  linear  functions.  The  coefficients  in  the  averages  of  Table  3 
are  rational  fractions  with  relatively  small  denominators,  and  the  user  will 
probably  find  it  convenient  to  use  as  weights  the  integers  in  the  numerators 
of  the  coefficients,  dividing  by  the  common  denominator  as  the  final  step. 

The  column  headings,  therefore,  are  c^  multiplied  by  the  common  denominator . 

In  both  Tables  2  and  3  advantage  has  been  taken  of  the  symmetry  of  the 
coefficients  c^  to  reduce  the  length  of  the  columns  by  approximately  one- 
half.  The  manner  of  using  the  tables  may  be  illustrated  by  taking  Spencer's 
15-term  average  as  an  example.  Equation  (1.1)  shows  the  calculation  of  the 
moving  averages.  The  intermediate  values  yx  for  x  -  P  -  1  to  P  -  7  are 


I 


calculated  successively  by  the  formula 

yx  =  .961572yx+1  +  .372752yx+2  +  .015904yx+3  -  .123488yx+4  -  .12522< 

-  .075887yx+6  -  .025624yx+?  . 

The  intermediate  values  for  x  =  Q  +  1  to  Q  +  7  are  calculated  by  the  iden¬ 
tical  formula  except  that  the  plus  signs  in  the  subscripts  are  changed  to 
minus  signs. 

The  extension  procedure  drastically  reduces  the  number  of  values  that 
need  to  be  tabulated  for  a  given  weighted  average,  and  makes  it  possible,  for 
example,  to  give  complete  information  about  21  such  averages  in  the  reasonably 
compact  Tables  2  and  3.  However,  the  user  who  intends  to  apply  a  single 
weighted  average  to  many  data  sets  may  prefer  to  tabulate  the  atypical  ele¬ 
ments  of  the  smoothing  matrix  G  for  that  weighted  average,  and  so  avoid  the 
extra  step  of  calculating  the  intermediate  values.  For  the  benefit  of  such 
users,  a  method  of  calculating  the  atypical  rows  of  G  will  now  be  described. 
We  observe  that  the  nonzero  elements  of  each  row  of  G  except  the  first  m 
and  the  last  m  rows  are  merely  the  coefficients  c_.  of  the  MWA  centered 
about  the  diagonal  element.  The  elements  in  the  first  m  rows  of  G  ,  ex¬ 
cept  for  the  first  m  columns,  follow  from  the  symmetry  of  G  ,  and  if 

G  ■  (g, .)  we  have 
i} 

9ij  "  °j-i  * 

This  leaves  only  the  square  submatrix  of  order  m  in  the  upper  left  corner 
to  be  calculated.  Let  c  denote  the  constant  q_  /p  ,  where  p  is 

the  term  free  of  z  in  the  polynomial  p(z)  ,  and  let  A^  ■  (a^)  denote  the 
square  matrix  of  order  m  given  by 

0  for  i  >  j 

a^  *  ■  1  for  i  ■  j 

-a^  for  i  <  j  . 
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Then  the  required  submatrix  in  the  upper  left  corner  of  G  is  given  by 

T 

1  —  cA^  * 

where  the  superscript  T  denotes  the  transpose .  The  similar  submatrix  in  the 
lower  right  corner  of  G  contains  the  same  elements,  but  with  the  order  of 
both  rows  and  columns  reversed.  Justification  for  this  procedure  lies  in  the 
fact  that  F  =  I  -  G  is  a  symmetric  Trench  matrix  (see  the  following  section 
5). 

5.  THE  GRADUATION  MATRIX 

In  order  to  describe  the  unique  graduation  matrix  G  of  (2.1)  that 
arises  when  the  preferred  method  of  overall  graduation  is  used,  it  is  neces¬ 
sary  to  define  certain  special  classes  of  matrices.  A  square  matrix 
N 

M  =  (m_.  „  will  be  called  a  band  matrix  if  there  are  nonneqative  inte- 

1D  i/J-u  1 

gers  h  and  k  such  that  «■  0  whenever  j  -  i  >  h  and  also  whenever 

i  -  j  >  k  .  Note  that  we  have  started  the  numbering  of  rows  and  columns  with 
0  rather  than  1  .  M  will  be  called  strictly  banded  if,  in  addition, 
h  +  k  <_  N  .  in  all  the  banded  and  strictly  banded  matrices  to  be  discussed 
in  this  paper,  h  and  k  will  be  equal. 

M  is  called  a  Toeplitz  matrix  if  all  the  elements  on  each  stripe  are 
equal,  where  a  stripe  (Thrall  and  Tomheim  1957)  is  either  the  principal  diag¬ 
onal  of  the  matrix  or  any  diagonal  line  of  elements  parallel  to  the  main  diag¬ 
onal  .  In  other  words,  M  is  a  Toeplitz  matrix  if  there  exists  a  sequence 
t_N  »t.N+i#***#tN  such  that  all  i  and  j 

mi j  *  tj-i  * 

A  strictly  banded  matrix  will  be  called  a  Trench  matrix  if  it  has  a  spe¬ 
cial  structure  that  will  now  be  described.  Let  M^(x)  denote  the  generating 
function  of  the  elements  of  the  ith  row:  thus, 

M.(x)  -  l  m  xJ  . 

j-0  13 
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Then  M  is  a  Trench  matrix  if 

*  -1 

A(x)  2  B  .x  3  (0  <_  i  <  k) 

j-0  3 

M.(x)-<  A(x)B (1/x)  (k  <_  i  <_  N  -  h)  (5.1) 

1  N  i 

B  (1/x)  l  a  x3  (N  -  h  <  i  <_  N) , 

j*N-i  3 
h  k 

where  A(x)  -  £  a.x3  and  B(x)  *  £  $.x3  are  given  polynomials  of  degree 

j«0  3  j»0  3 

h  and  k  ,  respectively,  with  aQB0  +  0  .  In  all  the  applications  of  Trench 
matrices  to  be  made  in  this  paper,  h  =  k  and  the  coefficients  a_.  and  B_. 
are  real.  Careful  examination  of  (5.1)  reveals  that  a  Trench  matrix  is  quasi - 
Toeplitz .  By  this  is  meant  that  the  Toeplitz  property 

“i+l,j+l  “  ®ij 

holds  so  long  as  neither  of  these  elements  is  contained  in  the  k  by  h  sub¬ 
matrix  in  the  upper  left  corner  of  M  or  in  the  h  by  k  submatrix  in  the 
lower  right  corner.  The  converse,  however,  is  not  true.  A  strictly  banded, 
quasi -Toeplitz  matrix  is  not  necessarily  a  Trench  matrix.  In  a  Trench  matrix 
the  elements  of  the  special  comer  submatrices  must  be  related  in  a  particular 
way  to  those  of  the  main  part  of  the  matrix  as  indicated  by  (5.1) . 

Greville  and  Trench  (Greville  and  Trench  1979;  Greville  1979,  I960)  have 
studied  the  properties  of  Trench  matrices.  In  the  joint  paper  they  have  shown 
that  a  strictly  banded  matrix  has  a  Toeplitz  inverse  if  and  only  if  it  is  a 
nonsingular  Trench  matrix,  and  further  that  a  Trench  matrix  is  nonsingular  if 
and  only  if  A(x)  and  B(l/x)  have  no  common  zero. 

A  rectangular  matrix  X  of  N  -  s  rows  and  N  columns  will  be  called  a 
differencing  matrix  if  it  transforms  a  column  vector  into  the  column  of  sth 
finite  differences  of  the  elements  of  the  vector.  Evidently  the  nonzero  ele¬ 
ments  of  each  row  of  X  are  the  successive  binomial  coefficients  of  order  s 
with  alternating  signs,  and  the  nonzero  elements  of  X  form  a  diagonal  band 
from  the  upper  left  to  the  lower  right. 


The  following  theorem  describes  the  smoothing  matrix  G  of  the  natural 


extension. 

Theorem  5.1,  Let  a  given  MWA  of  2m  +  1  terms  be  exact  for  the  degree 
2s  -  1  (with  s  <_ m)  and  such  that  q(z)  has  no  zeros  on  the  unit  circle 
and  qQ  >  0  .  Then,  for  every  N  2m  +  1  ,  there  is  a  unique  square  matrix 
G  of  order  N  having  the  following  five  properties: 

(1)  If  y  is  any  vector  of  observed  values  and  u  =  Gy  is  regarded  as 
the  corresponding  vector  of  graduated  values,  the  elements  of  u  ,  except  the 
first  m  and  the  last  m  ,  are  merely  the  graduated  values  that  would  be  ob¬ 
tained  by  the  use  of  the  given  MWA. 

(2)  G  is  strictly  banded  with  h  =  k  *  m. 

(3)  G  is  of  the  form 

G  «  I  -  KT  DK  (5.2) 

for  some  square  matrix  0  of  order  N  -  s. 

(4)  D  is  nonsingular  and  has  a  Toeplitz  inverse. 


The  unique  matrix  G  so  determined  has  the  following  properties: 

(6)  D  is  unique  and  G  and  D  are  symnetric. 

(7)  The  nonzero  elements  of  the  first  m  and  the  last  m  rows  of  G 
depend  only  on  the  given  MWA  and  do  not  depend  on  N  .  A  similar  statement 
applies  to  the  nonzero  elements  of  the  first  m  -  s  and  the  last  m  -  s  rows 
of  D  . 
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(8)  0  is  a  Trench  matrix  with  B(x)  *  A(x).  Moreover, 

q  (x)  -  A(X)  A(l/x)  (5.4  ' 

and  the  m  -  s  zeros  of  A(x)  are  those  zeros  of  q(x)  that  are  outside  the 
unit  circle. 

T 

(9)  F  =  I  -  G  =  K  DK  is  a  (singular)  Trench  matrix  characterized  by 
the  polynomials 

A(x)  *  B (x)  *  (x  -  1)S  A(x)  . 

(10)  The  series  (5.3)  converges  to  [q(z)]  1  in  an  annulus  containing 
the  unit  circle. 


He  shall  defer  the  proof  of  this  theorem  until  after  some  explanation  and 
motivation  have  been  given.  He  begin  with  a  numerical  example,  which  may  help 
to  clarify  the  relationships  involved.  Then  we  shall  seek  to  justify  the  im¬ 
position  of  the  five  properties  that  uniquely  determine  the  graduation  matrix 
G  .  In  Section  7  we  shall  prove  the  theorem,  and  in  Section  8  we  shall  show 
how  the  extension  algorithm  can  be  rationalized  and  prove  that  it  is  mathe¬ 
matically  equivalent  to  the  matrix  formulation. 

For  virtually  all  the  MHA's  likely  to  be  used  in  practice  the  elements 
of  the  square  submatrices  of  order  m  in  the  upper  left  and  lower  right  cor¬ 
ners  of  G  are  irrational.  However,  for  convenience  of  illustration,  we  have 


contrived  an  example  with  rational  elements.  This  is  the  MWA, 

“«  ■  ?  «Va  *  Vl  *  ♦  Vi  +  2W  < 

which  is  exact  only  for  linear  functions  and  is  unlikely  to  be  used  in  a  prac¬ 


tical  situation.  However,  it  does  satisfy  Schoenberg's  criterion  (see  Section 
11)  for  a  satisfactory  smoothing  formula.  Here  m  ■  2,  s  ■  1,  q(x)  ■ 

(2x_1  +  5  +  2x) ,  and  A(x)  -  j  (2  +  x) .  For  N  ■  7  , 
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Finally, 

tv  «  3{-l/2)iVl  , 

and  it  is  easily  verified  that  the  series  (5.3)  does  in  fact  converge  to 
[q (z) ] -1  =  9(2z-1  +  5  +  2z)-1  for  1/2  <  z  <  2  .  So  long  as  N  >_  5  ,  the 
corresponding  results  for  any  value  of  N  could  easily  be  written  down. 

Let  us  now  look  at  the  five  properties  that  uniquely  determine  G  .  The 
first  of  these  is  no  more  than  a  restatement  of  the  problem  to  be  solved.  The 
second  is  a  reasonable  requirement  and  amounts  to  saying  that  the  proposed 
method  of  graduation  is  a  "local"  procedure:  the  graduated  value  of  a  given 
observation  is  not  to  depend  on  other  obeervations  removed  from  it  by  a  dis¬ 
tance  greater  than  m  . 
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Conditions  (3),  (4),  and  (5)  are  not  so  obviously  appropriate,  but  are 
strongly  suggested  by  analogies  to  the  Whittaker  graduation  method.  In  order 
to  make  these  analogies  clear,  that  method  is  briefly  described  in  the  fol¬ 
lowing  section. 

6.  THE  WHITTAKER  ANALOGIES 

The  objective  of  the  Whittaker  process  (Whittaker  1923;  Henderson  1924) 
is  to  choose  graduated  values  u^  (j  «  P,  P  +  1,...,Q)  in  such  a  way  as  to 
minimize  the  quantity 

Q  Q-s  _ 

£  W  (u  -  y.)2  +  9  I  (A8  u.)  ,  (6.1) 

j*p  3  3  3  j-P  3 

where  the  positive  weights  ,  the  positive  constant  g  ,  and  the  positive 
integer  s  are  chosen  a  priori  by  the  user.  The  solution  is  most  conven¬ 
iently  expressed  in  matrix  notation  as  follows  (Greville  1957,  1974a).  Let 
W  denote  the  diagonal  matrix  of  order  N  whose  successive  diagonal  elements 
are  the  weights  W^  ,  let  u  and  y  be  defined  as  in  Section  2,  and  let  K 
be  the  differencing  matrix  of  Section  5.  Then,  the  expression  (6.1)  can  be 
written  in  the  form 

(u  -  y)T  W(u  -  y)  +  g(Ku)T  Xu  .  (6.2) 

It  is  easily  seen  (Greville  1974a)  that  (6.2)  is  smallest  when  u  satisfies 

(W  +  gKT  k)u  -  Wy  .  (6.3) 

It  is  not  difficult  to  show  (Greville  1957,  1974a)  that  the  matrix  in  the  left 
member  of  (6.3)  is  nonsingular  (in  fact,  positive  definite)  and  therefore 

u  -  (W  +  gKT  K)"1  Wy. 

The  Whittaker  method  has  several  interesting  properties.  Conuonly  the 
weight  Wj  is  taken  as  the  reciprocal  of  an  estimate  of  the  variance  of  the 
^th  obeervation.  When  this  is  done,  the  graduated  values  are  constrained  to¬ 
ward  the  observations  where  these  are  reliable,  and  toward  the  form  of  a  poly¬ 
nomial  of  degree  s  -  1  where  the  observations  are  less  reliable.  The  method 
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has  the  interesting  "moment"  property 

Q  Q 

l  w,  JV«,  -  l  w,  jV  (v  -  0,1,..., s  -  l). 

j-P  J  J  j-P  J  J 

Even  in  the  case  of  equal  weights  (equivalent  to  taking  W  -  I)  it  has  been 
found  to  be  a  serviceable  method.  The  ability  to  choose  the  constant  g  at 
will  enables  the  user  to  decide  how  gentle  or  how  drastic  he  wants  the  smooth¬ 
ing  to  be.  The  remaining  discussion  will  be  limited  to  that  case,  so  that 

u  -  (1  +  gKT  K)"1  y. 

Thus ,  in  terms  of  (2.1)  we  have  for  "unweighted"  Whittaker  graduation 
G  -  (1  +  gKT  K)"1  . 

It  is  then  easily  verified  (Noble  1969,  page  147)  that 

r  -i  T  -1 

G«I-K(gI+  KK  )  K. 

This  is  of  the  form  (5.2)  with 

-1  T  -1 

D  -  (g  I  +  KK  )  , 

so  that  property  (3)  of  Theorem  5.1  is  also  a  property  of  the  unweighted 


Whittaker  method.  It  follows  also  that 

-1  -1  „  T 

D  -  g  I  +  KK  , 


(6.4) 


and,  if  we  recall  the  definition  of  K  in  Section  5,  it  is  not  difficult  to 

T  T 

see  that  KK  is  a  Toeplits  matrix.  In  fact,  if  KJC  »  (i^),  we  have 


l ,  where 


*v  ■  • 

.2s, 


with  the  understanding  that  (  ^ )  vanishes  for  j  <  0  and  for  j  >  2s. 


Therefore 


where 


D‘l  ■  ‘V  '  Vi  • 


-1  ,  .  ,  , » v  ,  2s  . 

t  ■  g  6  +  (-1)  (  .  ) , 

v  *  ov  s+v 


(6.5) 


6ov  being  a  Kronecker  symbol.  Thus  the  unweighted  Whittaker  method  has  prop¬ 
erty  (4).  Finally,  in  the  Whittaker  method  given  by  (6.4)  is  a  band 


matrix,  and  therefore  the  series  (5.3)  with  t^  given  by  (6.5)  is  finit.  id 
converges  everywhere  except  at  the  origin.  We  have,  in  fact, 

f  %  zv  -  g-1  +  (-1)3  (z1'2  -  z-1/2)28  . 

v*—00 

As  the  Whittaker  method  is  a  highly  regarded  graduation  procedure,  these 
analogies  constitute  a  strong  argument  in  favor  of  the  natural  extension  of 
MWA  graduation.  Further  arguments  are  provided  by  the  stability  theorem  of 
Section  11  and  the  optimal  property  of  Rq  ("reduction  of  error")  for  the 
top  and  bottom  rows  of  G  described  in  Section  10. 

7.  PROOF  AND  DISCUSSION  OF  THE  MAIN  THEOREM 

Proof  of  Theorem  5.1.  Under  the  hypotheses  of  Theorem  5.1,  we  shall 
first  construct  a  matrix  G  having  the  properties  (1)  to  (10),  and  shall 
then  show  that  it  is  uniquely  determined  by  properties  (1)  to  (5) . 

As  pointed  out  in  Section  3  ,  if  p  is  a  zero  of  q(z),  it  follows 
from  the  symmetry  of  the  coefficients  that  p_1  is  a  zero.  As  there  are  no 
zeros  on  the  unit  circle,  there  are  m  -  s  zeros  inside  the  unit  circle  and 
the  same  number  outside.  As  any  complex  zeros  must  occur  in  conjugate  pairs, 
there  is  a  polynomial  A(z)  with  real  coefficients  whose  zeros  are  the  m  -  s 
zeros  of  q(z)  outside  the  unit  circle.  Moreover,  the  coefficients  of  A(z) 
can  be  normalized  so  that 

q(z)  -  ±A(z)  a(1/z)  .  (7.1) 

m-s  , 

If  a(z)  *  [  a  z3  ,  we  have 

j-0  3 

“r8  2 

%  -  ±  l  •  (7.2) 

j-0  3 

Since,  by  hypothesis,  qQ  >  0  ,  the  positive  sign  holds  in  (7.2)  and  (7.1). 

Now,  let  D  be  given  by  property  (8)  t  that  is,  D  is  the  Trench  matrix 


in  which  the  polynomial  A(s)  of  (7.1)  plays  the  role  of  both  A(x)  and 
■  (x)  in  (5.1).  it  follows  from  (5.1)  that  D  is  symmetric.  As  the  seros  of 
A(s)  are  all  outside  the  unit  circle,  those  of  A(l/s)  are  all  inside.  Thus, 


they  have  no  common  zero,  and  it  follows  {Greville  and  Trench  1979)  that  D 
is  nonsingular  and  D  is  Toeplitz.  This  verifies  property  (4) ,  and  it 
follows  (see  Greville  1979)  that  the  series  (5.3)  is  a  "reciprocal"  of  q(z) 
at  least  in  the  sense  that  if  it  is  formally  multiplied  by  q(z),  the  product 
is  unity,  it  is  shown  in  the  paper  cited  that  the  series  (5.3)  converges  in 
some  part  of  the  complex  plane  if  (and,  under  the  constraints  imposed  by  (5.4), 
only  if)  A (x)  is  chosen  in  the  manner  prescribed  by  property  (8).  it  is 
shown  further  that,  when  this  is  done,  (5.3)  converges  to  [q(z)]_1  in  an 
annulus  bounded  by  circles  of  radii  p  and  p-1,  where  p  is  the  minimum 
absolute  value  of  the  zeros  of  A(x).  This  verifies  properties  (5)  and  (10). 

It  will  be  noted,  incidentally,  that  the  elements  of  D_1  depend  only  on 
A(x)  and  not  on  N  .  Increasing  N  merely  extends  the  sequence  {t  }  with¬ 
out  changing  the  elements  already  determined. 

Now,  let  G  be  given  by  (5.2),  so  that  property  (3)  holds.  The  symmetry 
of  G  follows  from  that  of  D  ,  and  property  (9)  also  follows  (see  Lemma  1 
of  Greville  1980).  property  (7)  is  a  consequence  of  (5.1)  and  property  (1) 
follows  from  (3.2),  (5.1),  and  property  (9).  Property  (2)  follows  from  the 
fact  that  f  is  a  Trench  matrix. 

To  prove  uniqueness  we  assusM  that  properties  (1)  to  (5)  hold.  It  then 
follows  from  properties  (2)  and  (3)  that  D  is  strictly  banded  with  h  -  k  * 
m  -  s.  Because  of  the  structure  of  K  ,  it  is  not  difficult  to  see  that  if  D 
had  a  nonzero  element  outside  the  specified  band,  G  would  necessarily  have 
one  or  more  elements  outside  the  band  prescribed  by  property  (2) .  Since,  by 
property  (4),  D  ia-  a  Toeplitz  inverse ;  it  is  a  Trench  matrix  (Greville  and 
Trench  1979) . 

Let  A  (x)  and  B(x)  be  the  polynosdals  of  degree  m  -  s  associated  with 
D  ■  (d  ) ,  and  let 

m+s 

d(x)  ■  l  d  xV  -  A (x)  B (1/x) . 
v«-m+s 
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Then,  by  (5.1), 


except  in  the  square  suhnatrices  of  order  m  -  s  in  the  upper  left  and 

lower  right  comers  of  D  .  But  it  follows  from  (5.2)  (see  Lemma  1  of  Greville 

T 

1980)  that  F  ■  (f^)  ■  I  -  G  *  K  DK  is  a  (singular)  Trench  matrix  charac¬ 
terised  by  the  polynomials  A(x)  ■  (x  -  l)s  A(x)  and  B(x)  ■  (x  -  l)s  B(x). 

If 

d(x)  *  A(x)  B (1/x)  »  (-1)S  (x1/2  -  x  1^2)2s  d(x)  -  £  xV  , 

v*-m 

it  follows  that 


except  in  the  square  submatrices  of  order  m  in  the  upper  left  and  lower  right 
comers  of  F  .  But  it  follows  from  (3.2)  and  property  (1)  that,  with  the 
possible  exception  of  the  elements  of  the  first  m  and  the  last  m  rows  of 
F  ,  f^  is  the  coefficient  of  x^-i  in  (-l)8^1^2  -  x”1^2)28  q(x).  There¬ 
fore  d (x)  -  q (x) ,  and  so 

dij  "  qj-i  (7*3) 

except  in  the  first  m  -  s  and  the  last  m  -  s  rows  of  D  ,  and 

q(x)  -  A(x)  B (1/x)  .  (7.4) 

Now,  (7.3)  and  the  fact  that  D  is  a  Trench  matrix  (and  has  the  quasi- 
Toeplitz  property)  uniquely  determine  all  its  elements  except  those  of  the  two 
special  corner  submatrices  of  order  m  -  s.  It  has  been  shown  elsewhere 
(Greville  1979)  that,  in  general,  these  corner  elements  can  be  chosen  in  a 
(finite)  number  of  ways,  corresponding  to  the  possible  ways  of  factoring  the 
polynomial  x®”*  q(x)  of  degree  2m  -  2s  as  the  product  of  two  polynomials, 
A(x)  and  *“-*  B (1/x)  of  degree  m  -  s  with  real  coefficients.  (Two  fac¬ 
torizations  are  considered  different  only  if  the  set  of  zeros  of  A(x)  is 
different.)  Property  (8)  prescribes  a  unique  factorisation,  and  therefore 
G  is  uniquely  determined  by  properties  (1)  to  (4)  and  (8) .  But  it  has  been 
shewn  (Greville  1979)  that  if  q(x)  is  given  and  (7.3)  and  (7.4)  hold, 


properties  (8)  and  (5)  are  equivalent.  Thus,  G  is  uniquely  determined  by 
properties  (1)  to  (5) .  ■ 

At  the  beginning  of  Theorem  5.1  two  general  hypotheses  concerning  the 
given  MWA  were  introduced:  that  qQ  is  positive,  and  that  q(z)  has  no 
zeros  on  the  unit  circle.  These  will  now  be  discussed.  It  follows  from  (7.2) 
that  qQ  0  ;  it  is  either  positive  or  negative.  Typically,  the  coefficients 
q^  are  all  positive  and  qQ  is  the  largest.  No  reasonable  MWA  could  have 
qQ  negative.  In  fact,  it  will  be  shown  in  Section  11  that  an  MWA  with  nega¬ 
tive  qQ  cannot  satisfy  Schoenberg's  criterion  for  a  satisfactory  smoothing 
formula . 

MWA's  with  zeros  of  q(z}  on  the  unit  circle  are  also  uncommon  (I  know 
of  no  example  in  the  published  literature) ,  but  it  is  easy  to  construct  such 
formulas,  and  the  presence  of  such  zeros  does  not  of  itself  suggest  any  in¬ 
herent  defect  in  the  MWA.  However,  when  there  are  such  zeros,  property  (5) 
of  Theorem  5.1  cannot  hold  (see  Greville  1979),  and  no  single  preferred  method 
of  extending  the  graduation  to  the  ends  of  the  data  is  clearly  indicated.  The 
case  of  zeros  of  q(z)  on  the  unit  circle  is  of  no  practical  importance,  and 
its  inclusion  in  Theorem  5.1  would  have  made  the  theorem  messy  and  complicated. 
The  curious  reader  might  wish  to  consult  Greville' s  (1979,  1980)  Manitoba 
Conference  papers  for  discussion  of  the  mathematical  questions  involved. 

8.  DEVELOPMENT  OF  THE  NATURAL  METHOD  THROUGH  THE  EXTENSION  ALGORITHM 
This  section  has  two  objectives:  first,  to  show  that  the  extension  algo¬ 
rithm  of  Section  3  is  mathematically  equivalent  to  the  matrix  approach  of 
Sections  5  and  7  ,  and  second,  to  provide  a  heuristic  argument  for  the  natural 
extension  of  the  graduation  that  is  a  partial  alternative  to  the  five  deter¬ 
mining  properties  listed  in  Theorem  5.1. 

Because  of  the  symmetry  of  the  coefficients  q^  of  q(z),  and  the  fact 
that  it  has  no  zeros  on  the  unit  circle,  there  is,  if  qQ  >  0  ,  a  polynomial 


A(z)  of  degree  m  -  s  with  real  coefficients  such  that  (5.4)  holds.  If 


m  >  s  ,  there  is  more  than  one  such  polynomial.  Let  the  polynomial 

m 

A(z)  =  l  a.  zJ  (8.1) 

j=0  3 

of  degree  m  be  defined  by 

A (z)  -  (z  -  1)S  A (z)  .  (8.2) 

Then  it  follows  from  (3.2)  that 

U  =  Y  -  A(E)  ME*1)  .  (8.3) 

AX  X 

Thus  it  would  be  possible  to  perform  the  graduation  by  means  of  the  following 
three  steps: 

1.  Operate  on  the  sequence  {y^}  with  A(E  1)  to  obtain  the  sequence 
of  values  of  A(E  1)  y 

x 

2.  Operate  on  the  latter  with  A(E)  to  obtain  the  sequence  of  values 
-  *  -1 

of  A (E)  A (E  )  yx  ,  which  may  be  thought  of  as  corrections  to  the  observed 

values  y  . 

x 

3.  Subtract  each  of  the  latter  values  from  the  corresponding  value  of 
y  to  obtain  the  graduated  values  u  . 

A  X 

Now,  suppose  the  sequence  {y^}  is  extended  backward  to  x  =  P  -  m  and 
forward  to  x  =  Q  +  m  by  imposing  the  conditions 

A (E)  yx  =  0  (x  =  p  -  1,  p  -  2,  P  -  m)  (8.4) 

and 


-  -1 

A (E  )  yx  -  0  (x  -  Q  +  1,  Q  +  2,  Q  +  m).  (8.5) 

These  extensions  make  it  possible  to  calculate  by  the  main  formula  graduated 

values  ux  over  the  entire  range  x  ■  p,  p  +  1,  Q.  For  x  -  P,  P  +  1, 

...,  P  +  m  -  1,  (8.1)  and  (8.3)  give 

m 


y  -  U  ■  > 

x  X 

j-0 

In  view  of  (8.4)  this  reduces  to 

x-P 


y  -  u 
X  X 


j-0 


5j  V)  • 
5j  »!=)yx_,  . 


(8.6) 
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A  similar  argument  applies  for  x  -  Q,  Q  -  1,  . ..,  Q-m+  1,  with  A(E)  and 


-1 


A (E  )  interchanged,  and  we  have  there 


Q-x 


y  -  u  *  7  oi .  A(E  -■)  y  A  . 

x  x  jto  3  *  3 


(8.7) 


If  we  write  F  =  (f„),  it  follows  from  (8.6)  and  (8.7)  that 


y  -  U  =  Fy  , 

where  F  is  the  (singular)  Trench  matrix  characterized  by  two  polynomials  of 
degree  m  both  equal  to  A(x).  But  in  view  of  property  (8)  of  Theorem  5.1, 
this  is  exactly  the  matrix  F  =  I  -  G  of  property  (9)  of  that  theorem.  Thus 
the  two  approaches  give  the  same  result  if  the  polynomial  A(x)  is  the  same 
in  both  cases.  Therefore  let  us  consider  the  choice  of  A(x)  in  the  present 
context. 


For  a  moment  let  us  think  of  the  sequence  {y^}  as  extended  indefinitely 


to  the  left  of  x  =  P  rather  than  only  as  far  as  x  «*  P  -  m.  Then,  the  gen¬ 
eral  solution  of  (8.*. 

m-s 

yx  =  VX>  +  l  bi  ri  '  (8‘8) 

i*l  J  3 


where  r  ,  r  ,  ...,  r  _  are  the  zeros  of  A(x),  P  (x)  is  an  arbitrary  poly- 


1'  2*  "  ' '  "m-s  - - - - -  *1 

nomial  of  degree  less  than  s  ,  and  b, ,  b. ,  . . . ,  b  are  arbitrary  con- 

l  i  m-s 

stants.  If  A (x)  has  multiple  zeros,  (8.8)  is  replaced  by  a  slightly  differ¬ 
ent  expression,  but  the  end  result  is  the  same. 


Similarly,  the  general  solution  of  (8.5)  is 

m-s 


y  -  P  (x)  +  l  Sj  r 


j 


-x 

i  ’3  *j  * 


(8.9) 


Let  us  now  impose  the  conditions  that  A  y  with  y  given  by  (8.8)  shall  approach 

X  X 


zero  as  x  tends  to  and  that  A  y^  with  y^  given  by  (8.9)  shall  ap¬ 


proach  zero  as  x  tends  to  ®  .  Since  A(x)  must  satisfy  (5.4),  it  is  clear 
that  these  conditions  are  satisfied  if  and  only  if  A(x)  is  chosen  so  that 
its  zeros  are  the  m-s  zeros  of  q(x)  that  are  outside  the  unit  circle. 
Clearly,  this  is  the  choice  of  A(x)  prescribed  by  Theorem  5.1.  Moreover, 
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^ if*  ►  i« fa  u^iii  a  yi«,"^' 


A (x)  chosen  in  this  manner  is  closely  related  to  the  polynomial  p(x)  of 
Section  3.  In  fact, 

A(x)  -  o  xm“8  p (1/x) , 
in— s 

and  consequently, 

A(x)  =  a  xm“s  a (1/x)  , 
m— s 

where  a(x)  is  defined  by  (3.4).  Thus,  with  this  choice  of  A(x),  (8.4)  and 

(8.5)  are  equivalent  to  the  extension  algorithm  of  Section  3. 

We  note  in  passing  that  the  computational  short  cut  involving  extended 

values  has  an  analogue  in  the  case  of  Whittaker  smoothing.  Especially  in 

actuarial  literature,  the  Whittaker  smoothing  process  is  sometimes  called  the 

difference-equation  method  because  the  difference  equation 

u  +  (-1)8  628  u  -  y  (8.10) 

X  X  X 

holds  for  x-P  +  s,  P  +  6  +  1,  . ..,  Q  -  s.  Aitken  (1926)  pointed  out  that 
(8.10)  is  satisfied  for  x  «  P,  P  +  1,  . . . ,  Q  if  we  annex  at  each  end  of  the 
data  set  s  extrapolated  values  of  both  and  ux  satisfying  the  condi¬ 

tions 

ux  "  yx  (x*P-j»x*Q+j>  j“l»2,  ...,  s), 

A8  ux  ■  0  (x  -  P  -  j,  x  -  Q  -  j;  j  -  1,  2,  . . . ,  s) . 

However,  this  observation  is  not  helpful  from  a  computational  point  of  view. 
The  attempt  to  utilize  it  merely  increases  from  N  to  N  +  2s  the  order  of 
the  linear  system  to  be  solved. 

A  final  comment  regarding  motivation  of  the  extension  algorithm  may  be 
in  order.  Elsewhere  (Greville  1957)  I  referred  to  the  notion  of  the  "smooth 
space."  This  is  the  space  of  vectors  y  such  that  Gy  ■  y  ,  the  space  of 
vectors  that  are  unchanged  by  the  graduation  process.  For  the  Whittaker  proc¬ 
ess  it  is  the  space  of  "polynomial  vectors”  of  degree  less  than  s  . 
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In  a  strict  mathematical  sense,  the  smooth  space  is  the  same  for  the 
graduation  procedure  considered  here,  but  in  reality  the  situation  is  more 
complicated.  Equation  (8.3)  shows  that  if  (8.4)  should  hold  "across  the 
board,"  all  the  graduated  values,  to  the  extent  that  they  can  be  calculated 
by  the  main  formula  without  extension,  would  be  equal  to  the  observed  values. 

i 

Of  course,  the  result  would  be  similar  if  (8.5)  should  hold  "across  the  board." 
Now,  the  two  conditions  (8.4)  and  (8.5)  are  not  equivalent.  The  corresponding 
conditions  in  the  Whittaker  case  are  the  vanishing  of  the  sth  finite  differ¬ 
ences  of  the  observed  values,  which  are  equivalent  because  of  the  symmetry  of 
the  (binomial)  coefficients  in  the  expressions  for  these  finite  differences. 

The  coefficients  of  A(x)  have  no  such  symmetry. 

These  observations  suggest  that  the  true  analogue  of  the  Whittaker  proc¬ 
ess  is  arrived  at  by  using  the  different  criteria  (8.4)  and  (8.5)  at  the  two 
ends  of  the  data.  As  previously  noted,  there  are  in  general  different  ways 
of  choosing  A(x)  so  that 

A(E)  A^1)  -  (-1)S  62s  q(E)  , 

and  we  have  made  the  unique  choice  that  makes  the  extension  a  "stable"  opera¬ 
tion  at  both  ends. 

9.  SPECIAL  CLASSES  OF  MOVING  AVERAGES 
Of  particular  interest  are  those  moving  averages  known  to  actuaries  as 
minimum-R^  formulas  and  to  economic  statisticians  as  "Henderson's  ideal" 
formulas.  For  a  given  number  of  terms  2m  +  1,  this  is  the  average  (1.2), 


exact  for  the  third  degree,  for  which  the  quantity 

5  3  2 

l  (A  c  r  (9.1) 

j»-m-3  ^ 

is  smallest  (with  the  understanding  that  c,  -  o  for  j  j|  >  m) .  The  "smoothing 
coefficient"  is  defined  as  the  quantity  obtained  by  dividing  (9.1)  by  20 
and  talcing  the  square  root.  The  divisor  20  is  chosen  because  this  is  the 


value  of  (9.1)  for  the  trivial  case  of  (1.2)  in  which  c  *  1  and  c  -  0 

o  J 

for  j  +  0. 

The  rationale  for  minimizing  (9.1)  may  be  explained  as  follows  (Greville 
1974a).  If,  for  some  x  ,  u  ,  u  ,  u  ,  and  u  ,  are  all  given  by  (1.2) 

X  Xt  X  At  &  Xt  «3 

(which  is  the  case  for  x=P+m  to  Q  -  m  -  3,  inclusive),  then 

a3  »x  -  -  z  y  C  >  y  .  (9.2) 

3=-m-3 

It  has  been  customary  to  regard  the  smallness  (in  absolute  value)  of  the  third 
differences  of  the  graduated  values  as  am  indication  of  smoothness.  Therefore 

3 

(9.2)  suggests  that  smoothness  is  encouraged  by  making  the  quantities  a  c  . 

numerically  small,  and  minimizing  (9.1)  is  a  way  of  doing  this.  The  formula 

corresponding  to  (9.2)  for  a  general  order  of  differences  is 

A3  u  -  (-1)S  l  (A3  c  )  y  ,  (9.3) 

X  j— m-s  3  x+3+s 

and  the  general  formula  for  Rs  is 

R*  -  l  (AS  c.)2/(23).  (9.4) 

j— m-s  3 

There  is  some  question  whether  Henderson's  contribution  warrants  attach¬ 
ing  his  name  to  the  "ideal"  weighted  averages.  De  Forest  (1873)  treated  ex¬ 
tensively  the  formulas  that  minimize  R^  .  The  concept  of  choosing  the  co¬ 
efficients  Cj  in  order  to  minimize  R^  seems  to  have  been  first  mentioned 
by  G.  F.  Hardy  (1909) .  These  averages  were  fully  discussed  by  Sheppard  (1913) 
slightly  earlier  than  by  Henderson  (1916).  However,  Henderson  does  seem  to 
have  been  the  first  to  give  an  explicit  formula  for  the  coefficient  c^  in 
the  weighted  average  minimizing  R^  (Henderson  1916,  p.  43;  Macaulay  1931, 
p.  54j  Henderson  1938,  p.  60;  Miller  1946,  p.  71;  Greville  1974a,  p.  18).  If 
we  write  k  ■  m  +  2,  so  that  the  weighted  average  has  2k  -  3  terms,  the  for¬ 
mula  is 

„  .  315[(k  -  l)2  -  j21(k2  -  i2) t (k  +  l)2  -  121(3k2  -  16  -  llj2) 


It* 
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Weighted  averages  that  minimize  Rg  have  been  discussed  from  other  points 
of  view  by  Wolfenden  (1925),  Schoenberg  (1946),  and  Greville  (1966,  1974b). 

Also  deserving  of  special  mention  are  the  averages  (exact  for  cubics)  that 
minimize  Rq/  sometimes  called  "formulas  of  maximum  weight"  or "Sheppard • s  ideal" 
formulas.  These  are  sometimes  applied  to  physical  measurements  when  the  errors 
of  observation  can  be  regarded  as  random  "white  noise"  (see  discussion  of  "re¬ 
duction  of  error"  in  Section  10) .  The  weights  sure  given  by 

3 (3m2  +  3m  -  1)  -  15j2 
Cj  "  (2m  -  1) (2m  +  1) (2m  +  3)  * 

Weighting  coefficients  c^  and  extension  coefficients  a..  for  minimum-R^ 
(Henderson's  ideal)  averages  of  5,  7,  ...»  23  terms  are  given  in  Table  2. 

10.  COMPARISON  WITH  OTHER  METHODS.  PRACTICAL  CONSIDERATIONS 

If  a  symmetrical  MWA  exact  for  the  degree  2s  -  1  is  being  used  to  smooth 
the  main  part  of  the  data,  it  can  easily  be  deduced,  either  from  the  extension 
algorithm  described  in  Sections  3  and  8  or  from  the  matrix  formulation  of 
Theorem  5.1  that  the  unsymmetrical  weightings  proposed  for  smoothing  the  first 
m  and  the  last  m  observations  are  exact  only  for  the  degree  s  -  1.  For 
example,  all  the  averages  represented  in  Tables  2  and  3  with  the  exception  of 
Hardy's  are  exact  for  cubics,  and  therefore  their  extensions  to  the  ends  are 
exact  only  for  linear  functions.  Hardy's  weighted  average  is  exact  for  linear 
functions  and  its  extension  only  for  constants. 

The  Whittaker  process  has  a  similar  property.  At  a  sufficient  distance 
from  the  ends  of  the  data,  polynomials  of  degree  2s  -  1  are  "almost"  re¬ 
produced  by  that  process.  In  support  of  this  rather  loose  statement  the  fol¬ 
lowing  heuristic  argument  is  advanced.  For  the  Whittaker  process 

G  -  (I  +  gKT  K)’1  -  I  -  gGKT  K. 

Thus,  if  y  is  the  vector  of  observed  values,  the  vector  of  corrections  to 
these  values  is 

-gGKT  Ky. 


T 

Now,  the  nonzero  elements  of  K  K,  with  the  exception  of  the  first  s  and 

the  last  s  rows,  are  binomial  coefficients  of  order  2s  with  alternating 

T 

signs.  Therefore  the  components  of  K  Ky,  except  for  the  first  s  and  the 

last  s  ,  are  (2s) th  differences  of  those  of  y  (or  their  negatives  if  s  is 

odd).  Thus,  if  y  is  a  vector  of  equally  spaced  ordinates  of  a  polynomial 

T 

of  degree  2s  -  1,  K  Ky  is  a  vector  of  zeros  except  for  the  first  s  and 

T 

the  last  s  components.  The  components  of  GK  Ky  are  graduated  values  of 
T 

those  of  K  Ky,  and  therefore  should  be  very  small  at  some  distance  from  the 
extremities  of  the  data.  Finally,  multiplication  by  g  ,  even  though  g  is 
typically  large,  should  give  small  corrections  at  a  sufficient  distance  from 
the  ends  of  the  data. 

Some  users  may  consider  the  reduction  in  degree  of  exactness  near  the 
ends  a  disadvantage  of  the  natural  method  of  extension.  Before  I  became  aware 
of  the  natural  method, 7  had  proposed  (Greville  1974a)  a  different  method  of 
extension  (already  mentioned  in  Section  2)  that  does  not  have  this  particular 
disadvantage  (though  it  has  other  shortcomings) .  This  involves  extrapolation 
by  a  polynomial  of  degree  2s  -  1  fitted  by  least  squares  to  the  first  m  +  1 
observations.  A  similar  polynomial  is  fitted  to  the  last  m  +  1  observations 
for  extrapolation  at  the  other  end  of  the  data.  There  may  be  a  gain  in  sim¬ 
plicity  by  using  a  single  method  of  extrapolation  for  all  symmetrical  weighted 
averages,  so  that  the  extrapolated  values  depend  only  on  the  niznber  of  terms 
in  the  main  formula.  However,  there  is  a  loss  in  that  the  extension  method 
is  no  longer  tailored  to  the  particular  symmetrical  average  used. 

Like  the  natural  method  of  extension,  the  method  using  extrapolation  by 
least  squares  can  be  collapsed  into  a  single  matrix  G  .  When  this  is  done, 
the  band  character  of  the  smoothing  matrix  is  maintained,  but  the  syonetry  is 
lost.  Though  the  matrix  approach  is  less  convenient  for  computational 
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purposes ,  the  differences  between  the  two  methods  are  best  elucidated  by 


comparing  the  first  m  rows  of  the  respective  matrices  G  .  This  is  done 
in  Tables  4  and  5  for  the  case  of  the  9-term  "ideal"  formula.  Here  m  =  4, 
but  for  convenience  the  fifth  row  is  also  shown.  Its  elements  would  be  re¬ 
peated  in  the  subsequent  rows,  moving  successively  to  the  right,  until  we 
come  to  the  last  four  rows.  While  an  average  of  as  few  as  9  terms  would  sel¬ 
dom  be  used  in  practice,  this  is  a  convenient  illustration. 

As  previously  indicated,  the  first  m  and  the  last  m  rows  of  G  may 
be  regarded  as  exhibiting  unsymmetrical  weighted  averages  which  are  to  be 
used  near  the  ends  of  the  data  to  supplement  the  symmetrical  average  used 
elsewhere.  The  coefficients  that  appear  in  the  last  m  rows  are  the  same  as 
those  in  the  first  m  rows,  but  the  order  is  reversed,  both  horizontally  and 
vertically.  It  should  be  noted  that  the  coefficients  in  the  supplemental 
averages  depend  only  on  those  of  the  underlying  symmetrical  average.  They  do 
not  depend  on  N  ,  the  number  of  observations  in  the  data  set  (which  is  the 
order  of  G  ) . 

The  coefficients  in  the  supplemental  weighted  averages  based  on  least- 
squares  extrapolation,  exhibited  in  Table  5,  show  two  undesirable  features. 
These  are  negative  coefficients  of  substantial  numerical  magnitude,  and  suc¬ 
cessive  waves  of  positive  and  negative  coefficients  as  one  proceeds  from  left 
to  right  along  the  rows.  The  number  of  such  waves  would  increase  as  the  num¬ 
ber  of  terms  in  the  underlying  formula  increases. 

in  striking  contrast  is  the  character  of  the  coefficients  of  the  natural 
extension.  Like  the  coefficients  in  the  underlying  symmetrical  formula,  each 
row  exhibits  4  peak  in  the  vicinity  of  the  main  diagonal  of  the  matrix,  taper¬ 
ing  off  to  a  single  group  of  negative  coefficients  of  reduced  sise  near  the 
edge  of  the  band. 
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In  the  least-squares  method  only  a  very  small  correction  is  made  t_>  th 
initial  observed  value.  The  corresponding  correction  in  the  natural  methc 


is  more  substantial. 

The  "second-difference  correction"  is  the  coefficient  of  the  second- 
difference  term  when  the  formula  is  expressed  in  terms  of  increasing  orders 
of  differences  in  the  form 

.2 

u  =  y  +cA  y  ^  +  . . . 
x  x  x-h 

The  coefficient  c  does  not  depend  on  the  subscript  x  -  h,  in  which  there 
is  some  freedom  of  choice.  For  the  formulas  based  on  least-squares  extra¬ 
polation,  which  are  exact  for  cubics ,  the  fourth-difference  correction  is 
similarly  defined. 

Some  writers  (Miller  1946;  Wolfenden  1942;  Greville  1974a)  have  regarded 

the  observed  values  y^  as  the  sum  of  "true"  values  Ux  and  superimposed 

random  errors  ex  .  If  it  is  assumed  that  the  errors  e^  for  different  x 

2 

are  uncorrelated,  and  have  zero  mean  and  constant  variance  a  for  all  x  , 

2  2  2 

then  the  variance  of  the  error  in  the  smoothed  value  u  is  R  a  ,  where  R 

x  o  o 

is  obtained  by  taking  s  «  0  in  (9.4).  Thus,  R  may  be  interpreted  as  the 

o 

ratio  of  reduction  in  the  standard  deviation  of  error  that  results  from  appli¬ 


cation  of  the  weighted  average. 


While  the  assumptions  underlying  the  preceding  analysis  may  be  ques¬ 
tioned,  nevertheless  a  good  case  can  be  made  that,  for  any  weighted  average, 

2 

Rq  should  be  less  than  unity.  Since  Rq  is  the  sun  of  the  squares  of  the 
coefficients  in  the  average,  Rq  c«m  never  be  less  than  the  maximum  of  the 
absolute  values  of  the  coefficients.  Thus,  an  average  cannot  be  considered 


satisfactory  if  the  absolute  value  of  any  coefficient  is  equal  to  or  greater 
than  unity. 


30 


When  the  graduation  is  extended  to  the  extremities  of  the  data,  these 
remarks  apply  not  only  to  the  main  formula  but  also  to  the  unsymmetrical  for¬ 
mulas  to  be  used  near  the  ends.  Tables  4  and  5  illustrate  the  fact  that  there 
is  a  strong  tendency  for  R^  to  become  large  as  we  approach  the  extremities 
◦f  the  data. 

In  this  connection,  the  natural  extension  has  an  important  optimal  prop¬ 
erty.  Let  us  suppose  that  the  main  formula  is  given  and  satisfies  the  general 
hypotheses  of  Theorem  5.1.  That  is,  it  is  symmetrical,  qQ  >  0  ,  and  q(z) 
has  no  zeros  on  the  unit  circle.  Further  we  suppose  that  cq  >  -1.  The  latter 
assumption  is  not  a  strong  one;  a  negative  value  of  cq  is  most  unusual  in 
any  case,  and  we  have  previously  stated  that  an  average  is  not  satisfactory 
if  the  absolute  value  of  any  coefficient  is  equal  to  or  greater  than  unity. 

In  addition  we  assume  that  G  is  symmetric  and  has  properties  (1)  to  (4)  of 
Theorem  5.1.  As  we  shall  see  in  Section  11,  there  are  cogent  reasons  for 
thinking  that  G  should  be  symmetric. 

Under  these  conditions  we  have  seen  that  in  general  there  are  a  number 
of  possible  choices  of  the  polynomial  A(x).  It  will  be  shown  that  Rq  for 
the  top  row  of  G  is  smallest  when  A(x)  is  chosen  in  the  unique  manner  pre¬ 
scribed  by  Theorem  5.1. 


(10.1) 
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Now,  since  0  *  a 

o  o 

2 

cessively,  1  -  aQ  , 
top  row  is  given  by 


the  nonzero  elements  in  the  top  row  of  G 

-a  S.  ,  -a  cL,  . ..,  -a  a  .  Therefore, 
o  l  02  o  m 


are,  suc- 
2 

R  for  this 
o 


_ 2  a  _ a  2  a  2  a  2  i  _  > 

R  *  1  -  2a  +  a  S*l-a(l  +  c), 
o  o  o  o  o 


(10.2) 


where  s  denotes  the  summation  contained  in  (10.1).  Now,  let  r^,  r2»  ..., 


r  denote  the  zeros  of  A (x) ,  so  that 
m-s 

m-s 


A(x)  *  a  „  TT  (x  ~  r.)  , 
m-s  jal  3 


and  let 


Then, 


m-s 


A  =  <-l)m+1  TT  r,  . 

j-1  3 


a 

=  -Aa  , 

ii 

<e 

a  t 

o 

m-s 

m 

m-s 

A 

-  ,  2 

,2  2 

*  -a 

a  =  Aa  , 

a  “ 

A  a 

o 

m  m-s 

o 

(0 

a2 

so  that  a  -  Ac  ,  and  (10.2)  becomes 
o  m 

R2  -  1  -  Ac  (1  +  c  )  . 
o  mo 

In  this  expression,  c  and  c  are  given;  A  is  the  only  variable*  More- 

o  m 

*2 

over,  Ac  ■  a  is  positive,  and  1  +  c  is  positive,  since  c  >  -1. 
mo  o  o 

2 

Therefore  Rq  is  smallest  when  | A |  is  largest ,  which  is  clearly  the  case 
when  the  zeros  of  A(x)  are  the  m-s  zeros  of  q(x)  that  are  largest  in 
absolute  value,  namely,  those  outside  the  unit  circle. 

Thus  the  smoothing  matrix  G  of  the  natural  extension  would  still  be 
uniquely  determined  if,  in  Theorem  5.1,  we  replaced  property  (5)  by  the  re¬ 
quirements  that  G  be  symmetric  and  that  Rq  for  the  top  row  of  G  be  as 
small  as  possible  subject  to  the  other  conditions  imposed.  It  appears  that 
the  requirement  that  G  be  symMtric  can  be  dropped  if  stronger  conditions 
are  iaposed  on  q(z) ,  but  the  algebraic  complication  of  the  proof  is  greatly 
increased. 

As  indicated  in  Section  9,  it  has  long  been  custoamry  to  regard  a  gradua¬ 
tion  as  smooth  if  the  third  differencee  of  the  graduated  values  are  small  in 


absolute  value.  If  G  «  (g.  .),  we  have 


Vi-i  ■  9ij  yP+j-i  • 


and  therefore 


1  Vi-i  ’  .1,  Vj-i  4i  ■  (10-3’ 

j=l  J  J 

where  the  subscript  of  A  indicates  that  the  differences  are  taken  with  re¬ 
spect  to  i  (i.e.,  down  the  columns  of  the  matrix).  If  one  avoids  the  corner 
submatrices ,  the  nonzero  elements  g„  in  (10.3)  are  merely  coefficients  in 
the  underlying  symmetrical  average,  and  (10.3)  reduces  to  (9.3).  This  was 
the  rationale  underlying  the  derivation  of  the  minimum-Rg  averages. 

Of  course,  if  G  is  symmetric,  it  makes  no  difference  whether  the  dif¬ 
ferences  are  taken  horizontally  or  vertically,  when  the  symnetry  of  G  is 
not  assisted,  care  must  be  exercised.  Many  years  ago  (Greville  1947,  1948)  i 
published  what  purported  to  be  coefficients  in  supplemental  averages  to  be 
used  near  the  ends  of  the  data  in  conjunction  with  minimum-R^  and  minimum-R^ 
symmetrical  averages.  The  symmetry  of  G  was  not  assumed,  and  I  made  the 
error  of  deriving  the  unsymmetrical  coefficients  by  minimizing  their  third  and 
fourth  differences  taken  horizontally.  The  tables  in  question  are  therefore 
based  on  an  incorrect  assumption.  Further  it  may  be  mentioned  in  passing  that 
in  the  1947-8  formulation  the  diagonal  band  character  was  not  maintained, 
since  the  supplemental  averages  contained  the  full  2m  +  1  terms. 

Table  6  shows,  for  the  natural  and  least-squares  extensions  of  the  9- term 
minimum-R^  formula,  those  third  differences  of  the  matrix  elements,  taken 
vertically,  that  involve  elements  of  the  first  five  rows.  The  entries  in  the 
fifth  row  of  Table  6  would  be  repeated  in  subsequent  rows,  moving  successively 
to  the  right.  Casual  inspection  of  the  table  shows  that  the  third  differences 
are  msMrically  smaller  for  the  natural  extension.  All  of  these  third  differ¬ 
ences  are  less  than  0.14  in  absolute  value.  Two  of  those  for  the  least-squares 
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extension  exceed  0.7  in  absolute  value. 


It  is  instructive  to  compare  the  natural  extension  with  the  least- 
squares  extension  for  the  numerical  example  of  Section  3.  Though  neither 
extension  is  recommended  for  use  when  additional  data  are  available  beyond 
the  range  of  the  original  data  set,  nevertheless  it  may  be  of  interest, 
purely  for  purposes  of  illustration,  to  choose  a  numerical  example  in  which 
such  additional  data  are  available,  and  this  has  been  done. 

Table  7  and  Figures  B  and  c  complement  Table  1  and  Figure  A,  showing, 
for  the  first  seven  months  of  1967  and  the  last  seven  months  of  1971,  the 
observed  values  of  precipitation  in  Madison,  Wisconsin,  and  the  graduated 
values  obtained  by  (i)  natural  extension  of  Spencer's  15-term  average, 

(ii)  least-squares  extension  of  the  same  average,  and  (iii)  use  of  additional 
data.  It  will  be  noted  that  the  least-squares  extension  is  strongly  con¬ 
strained  toward  each  of  the  two  terminal  observations  (January  1967  and 

December  1971) .  This  may  be  explained  by  the  fact  that  all  the  values  yx 

in  (1.2)  that  entered  into  the  calculation  of  these  graduated  values  are  in¬ 
cluded  in  either  the  m  +  1  observations  to  which  the  least-squares  cubic 

was  fitted  or  the  m  extrapolated  values  obtained  from  the  same  cubic.  On 

the  other  hand,  the  natural  extension  and  the  least-squares  extension  are  very 
close  together  at  the  interface  with  the  graduated  values  calculated  in  the 
standard  manner.  Thus,  for  the  months  of  July  1967  and  June  1971,  all  but  one 
of  the  values  entering  into  the  computation  (1.2)  are  identical  for  the 

two  methods. 

For  the  months  closer  to  the  interface  the  graduated  values  obtained  by 
introducing  additional  data  are  close  to  those  of  the  natural  extension.  This 
is  because  the  supplemental  unsymnetrical  averages  produced  by  the  natural  ex¬ 
tension  (unlike  those  of  the  least-squares  extension)  give  relatively  small 
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weight  to  the  observations  more  remote  from  the  one  being  graduated  (as  does 
the  underlying  symmetrical  formula) .  For  example,  the  values  for  the  natural 
extension  and  those  obtained  by  the  use  of  additional  data  are  indistinguish¬ 
able  in  Figure  B  for  April  to  July  1967.  In  the  last  months  of  1971  the  devia¬ 
tion  is  greater  because  the  first  two  months  of  1972  were  exceptionally  dry. 
This  could  not  have  been  predicted  from  the  data  for  preceding  months. 

Table  8  gives  certain  parameters  for  the  various  symmetrical  weighted 
averages  that  have  been  mentioned  previously.  The  column  headed  "Error"  re¬ 
quires  explanation.  This  is  the  error  committed  when  the  formula  in  question 
is  used  to  "smooth"  a  polynomial  of  degree  4  .  This  naturally  tends  to  in¬ 
crease  with  the  number  of  terms  in  the  formula.  Both  R  and  R  tend  to 

o  3 

decrease  with  increasing  number  of  terms.  Though  the  "ideal"  formulas  have 

been  derived  to  minimize  R^  ,  they  tend  to  produce  small  values  of  Rq  as 

well.  In  only  one  instance  (Vaughan)  does  a  "name"  formula  have  a  smaller 

r  than  the  "ideal"  formula  of  the  same  number  of  terms.  The  late  Hubert 
o 

Vaughan  was  a  remarkably  keen  analyst  of  MWA  smoothing. 

It  may  be  mentioned  in  passing  that  some  writers  (e.g.,  Henderson  1938) 

2 

call  the  reciprocal  of  Rq  the  "weight"  and  the  reciprocal  of  R^  the 
( smoothing )  "power . " 

11.  THE  STABILITY  THEOREM 

Schoenberg  (1946)  defined  the  characteristic  function  of  the  MWA  (1.2) 
as 

$(t)  -  l  c  eijt  .  (11.1) 

j*-m  ^ 

For  a  symmetrical  MWA  this  is  a  real  function  of  the  real  variable  t  ,  and 

can  be  expressed  in  the  alternative  form 

m 

$(t)  »  £  c,  cos  jt  . 

j«-m  ^ 

It  is  periodic  with  period  2ir  and  equal  to  unity  for  t  *  2irn  for  all 
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integers  n  . 


The  effect  of  MWA's  in  eliminating  or  reducing  certain  waves  has  been 
noted  (Elphinstone  1951;  Hannan  1970).  If  the  input  to  the  smoothing  process 
is  a  sine  wave,  which  may  be  represented  in  the  form 

yx  =  C  cos(rx  +  h)  ,  (11.2) 

it  can  be  shown  by  simple  algebraic  manipulation  that 

ux  =  yx  • 

where  u  =  2ir/r  is  the  period  of  y^  .  Thus ,  if  <|>  (2tt/w)  =  0  ,  the  wave  is 
annihilated  by  the  smoothing  process;  the  amplitude  is  severely  reduced  if  it 
is  close  to  zero.  Thus  MWA  smoothing  is  related  to  the  "filtering"  processes 
considered  by  Wiener  (1949)  and  others. 

Schoenberg  (1946)  defined  a  smoothing  formula  as  an  MWA  whose  character¬ 
istic  function  $(t)  satisfies  the  condition 

| ♦  (t) |  <  1  (11.3) 

for  all  t.  Thontde  (1965)  calls  (11.3)  "von  Neumann’s  condition"  without, 
however,  citing  any  specific  publication  of  von  Neumann.  Later  Schoenberg 
(1948,  1953)  suggested  the  stronger  condition 

| ♦ (t) |  <1  (0  <  t  <  2n) .  (11.4) 

Lanczos  (see  Schoenberg  1953)  pointed  out  that  (11.4)  is  obtained  by  requiring 
that  every  simple  vibration  (11.2)  be  diminished  in  amplitude  by  the  transfor¬ 
mation  (1.2) .  The  results  of  Section  5  of  the  present  paper  suggest  an  alter¬ 
native  definition  of  a  smoothing  formula.  Using  the  subscript  N  to  empha¬ 
size  the  fact  that  the  order  of  G  is  the  number  of  observations  in  the  data 
set,  we  shall  say  that  an  extension  of  (1.2)  by  means  of  a  smoothing  matrix 
G  is  stable  if  the  limit 

eo  n 

G  -  lim  G 
N  N 

n  +  » 

exists  for  all  N  .  Schoenberg  (1953,  footnote  3)  suggested  a  relationship 
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between  (11.3)  and  the  conditions  for  existence  of  the  infinite  power  of  a 
matrix  (Oldenburger  1940 »  Dresden  1942) ,  but  he  did  not  elaborate  the  connec¬ 
tion.  in  the  theorem  of  this  section  we  shall  attempt  to  do  so. 

In  Section  7  we  promised  to  justify  the  hypothesis  in  Theorem  5.1  that 
qQ  >  0  by  showing  that  if  the  characteristic  function  of  a  symmetric  MWA 
satisfies  (11.3)  and  q(z)  has  no  zeros  on  the  unit  circle,  then  qQ  >  0  . 
Consider  the  real  function 

iMt)  -  1  -  $(t)  (11.5) 

and  note  that  (11.3)  is  equivalent  to 

0  <_  iMt)  12  (11.6) 

for  all  t.  From  (3.1),  (3.2),  and  (11.1)  it  follows  that 

♦  (t)  -  (-1) *  (2i  sin  it)28q(eit)  -  (4sin2  Jt)S  qte1*)  ,  (11.7) 

and  therefore  (11.6)  implies  that  q(e^t)  is  nonnegative  for  0  <  t  ■«  2fl.  In 
fact,  it  is  positive,  since  q(z)  has  no  zeros  on  the  unit  circle,  and  by 
continuity  it  is  positive  for  t  -  0  as  well.  In  other  words,  q(l)  •»  0  . 

Now  let  the  polynomials  A(x)  and  B(x)  be  chosen  so  that  q(x)  -  A(x)  B(l/x) 
and  the  zeros  of  B(l/x)  are  the  reciprocals  of  those  of  A(x) .  This  is  alway 
possible  because  of  the  symmetry  of  the  coefficients  of  q(x).  Moreover,  the 
coefficients  in  those  polynomials  can  be  normalized,  as  in  the  proof  of  Theorem 
5.1,  so  that  (7.1)  and  (7.2)  hold,  and  therefore 

q(l)  -  ± (A (1) ) 2 . 

Since  q(l)  is  positive,  the  positive  sign  holds  in  (7.1)  and  (7.2),  and  con¬ 
sequently  qQ  >  0  . 

Before  stating  the  theorem  that  elucidates  the  relationship  of  condition 
(11.3)  to  the  smoothing  matrix  G  ,  we  need  to  describe  certain  results  pub¬ 
lished  elsewhere  that  will  be  used  in  the  proof.  In  a  recent  paper  (Greville 
1980}  1  have  studied  bounds  for  eigenvalues  of  Hermitian  Trench  matrices 
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(which  become  symmetric  Trench  matrices  when  the  elements  are  real) .  it  :u 
polynomials  that  characterize  a  real  symmetric  Trench  matrix  H  are  A(x) 
and  B(x)  of  degree  h  ,  we  have  seen  that  the  coefficients  can  be  normalize 
so  that  either  B(x)  =  A(x)  or  B(x)  =  -A(x).  If  the  minus  sign  holds,  one 
can  consider  the  symmetric  Trench  matrix  -H.  It  is  sufficient,  therefore, 
to  consider  the  case  in  which  B(x)  =  A(x). 

Let  A (x)  be  given  and  consider  the  family  of  symmetric  Trench  matrices 
Hn  of  order  N  (N  >_  2h  +  1)  characterized  by  A(x)  and  B(x)  **  A(x).  Let 

GN  "  1  “  ' 

where  p  is  a  positive  constant,  and  let 

h  (x)  »  A  (x)  A  (1/x) . 

Then  it  is  shown  that  h(x)  is  real  and  nonnegative  on  the  unit  circle,  and 
has  a  maximum  thereon,  which  we  denote  by  M.  Then  Corollary  1  of  the  cited 

CO 

paper  states  that  the  limit  exists  for  all  N  if  and  only  if 

p  <_  2/M 

and  no  zero  of  A(x)  is  inside  the  unit  circle  unless  it  is  also  a  zero  of 

A (1/x).  A  particular  application  of  Lemma  1  of  the  same  paper  yields  the 

result  that  if  D  is  a  Trench  matrix  characterized  by  the  polynomials  A(x) 

T 

and  B (x) ,  then  K  DK  (with  K  defined  as  in  Section  5)  is  a  (singular) 
Trench  matrix  characterized  by  the  polynomials  A(x)  *  (x  -  l)s  a(x)  and 
B(x)  *  (x  -  l)s  B(x).  For  convenience  in  the  proof  of  the  theorem  that  fol¬ 
lows,  we  shall  refer  to  Corollary  1  and  Lemma  1  of  the  paper  cited  as  merely 
"Corollary  1"  and  "Lemma  1." 

Theorem  11.1.  Let  a  symmetrical  MWA  (1.2)  be  given  and  let  the  asso¬ 
ciated  smoothing  matrix  G^  for  all  N  >_  2m  +  1  be  symmetric  and  have 
properties  (1)  to  (4)  of  Theorem  5.1.  Then  the  family  of  matrices  G„  is 
stable  if  and  only  if  (11.3)  holds  and  the  polynomial  A(x)  associated  with 
the  matrix  D  has  no  zero  inside  the  unit  circle. 


Proof .  From  the  hypotheses  stated  in  the  first  sentence  of  the  theorem 
we  can  deduce  certain  properties  of  the  matrices  F  and  D  by  reasoning 
similar  to  that  used  in  the  uniqueness  part  of  the  proof  of  Theorem  5.1. 

First  we  note  that  the  hypotheses  of  the  present  theorem  differ  slightly  from 
those  of  Theorem  5.1.  We  have  added  the  hypothesis  that  G  is  symmetric, 
and  have  omitted  the  restrictions  on  q(x).  However,  the  reader  should  note 
carefully  that  the  latter  omission  is  occasioned  only  by  the  fact  that  these 
restrictions  are  implied  by  the  symmetry  of  G  in  conjunction  with  other  hypo¬ 
theses.  The  symmetry  of  G  implies  that  of  F  .  As  the  rows  of  K  are  lin¬ 
early  independent,  it  has  full  row  rank  and  therefore  has  a  left  inverse,  say 

L  (see  Ben-Israel  and  Greville  1974,  Lemma  1.2).  Therefore, 

T  T  T 

L  FL  =  L  K  DKL  =  D  , 

and  consequently  D  is  symmetric. 

As  in  the  uniqueness  part  of  the  proof  of  Theorem  5.1,  it  follows  from 
property  (4)  that  D  is  a  nonsingular  Trench  matrix.  If  it  is  characterized 
by  the  two  polynomials  A(x)  and  B(x)  of  degree  m  -  s,  then 

q(x)  *  A(x)  B (1/x)  , 

as  in  the  earlier  proof.  As  D  is  real  and  symmetric,  the  coefficients  in 
these  polynomials  are  real  and  can  be  normalized  so  that 

B(x)  -  ±A(x) .  (11.8) 

As  we  have  omitted  the  hypothesis  that  qQ  >  0  ,  some  ambiguity  remains  about 
the  sign  of  the  right  member  of  (11.8)  until  further  hypotheses  are  introduced, 
and  we  have 

q(x)  -  ±A(x)  A(l/x).  (11.9) 

Now,  the  symmetry  and  nonsingularity  of  D  and  the  requirement  that 
A (x)  have  real  coefficients  imply  that  q(x)  has  no  zeros  on  the  unit  circle. 
As  we  have  seen,  symmetry  of  D  implies  (11.9),  and  nonsingularity  implies 
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(Greville  and  Trench  1979)  that  A(x)  and  A(l/x)  have  no  common  zero.  Now 


if  q(x)  has  a  zero  on  the  unit  circle,  say  x  ,  then  x”1  is  also  on  the 

o  o 

unit  circle,  so  that  A(x)  must  have  a  zero  on  the  unit  circle;  call  it  p  . 
Then  p  ^  is  a  zero  of  A(l/x) ,  and  p  is  a  zero  of  A(x),  since  A(x)  has 
real  coefficients.  But  p  =*  p  therefore  A(x)  and  A(l/x)  have  a  common 

zero.  Thus  the  supposition  that  q(x)  has  a  zero  on  the  unit  circle  is  false 


Now,  suppose  that  (11.3)  holds  and  A(x)  has  no  zeros  inside  the  unit 
circle.  Then  it  follows  from  the  discussion  following  (11.7)  that  the  posi¬ 
tive  sign  holds  in  (7.1)  and  (7.2),  and  therefore  in  (11.9).  By  Lemma  1,  F 
is  a  singular  Trench  matrix  characterized  by  the  polynomials 

A(x)  -  B(x)  -  (x  -  l)s  A(x) .  (11.10) 


Let 

f (x)  -  A(x)  B (1/x)  -  (x1/2  -  x"1/2)2s  q(x) . 

Then, 

*‘(t)  -  f(eit).  (11.11) 

Let  H  denote  the  maximum  value  of  f(x)  on  the  unit  circle.  Then  by  (11.6) 
M  ^  2,  or 

1  <_  2/M.  (11.12) 

Consequently,  by  Corollary  1,  the  family  of  matrices  is  stable. 

Conversely,  suppose  that  the  family  {G„>  is  stable,  in  addition  to  the 

hypotheses  in  the  first  sentence  of  the  theorem.  Since  G„  is  symmetric, 

its  eigenvalues  are  real,  and  stability  implies  (oldenburger  1940;  Dresden 

1942)  that  all  its  eigenvalues  are  in  the  half-open  interval  (-1,1).  In 

other  words,  all  the  eigenvalues  of  Fu  are  in  (0,2)  for  all  N.  Now,  if 

v  is  an  arbitrary  column  vector  of  real  elements,  it  is  well  known  that  the 

T  T 

minimum  value  of  the  Rayleigh  quotient  v  Fv/v  v  is  the  (algebraically) 


smallest  eigenvalue  of  F  .  Suppose  the  minus  sign  holds  in  (11.8)  and  let 

v  be  the  unit  vector  with  1  as  its  first  element  and  all  the  other  elements 

0  .  By  (11.10),  the  constant  term  of  A(x)  is  (-1)5  aQ,  and  the  Rayleigh 
2 

quotient  is  -o  ,  which  is  negative  since  aQ  5*  0  by  the  definition  of  a 
Trench  matrix.  Thus,  F  has  a  negative  eigenvalue,  in  contradiction  to  the 
statement  that  all  its  eigenvalues  are  in  [0,2) .  Therefore  the  supposition 
that  the  minus  sign  holds  in  (11.8)  is  false. 

Since  the  positive  sign  holds  in  (11.8),  F  belongs  to  the  class  of  mat¬ 
rices  to  which  Corollary  1  applies.  Thus  stability  of  the  family  implies 

that  A(x)  has  no  zero  inside  the  unit  circle  unless  it  is  also  a  zero  of 
A(l/x).  But  a  common  zero  of  A(x)  and  A(l/x)  would  imply  that  D  is  sin¬ 
gular,  which  would  contradict  property  (4).  Therefore  A(x)  has  no  zero  in¬ 
side  the  unit  circle.  Stability  implies  further  that  (11.12)  holds,  where  M 
is  defined  as  before,  and  this  implies  in  turn  that  M  ^  2,  which,  in  view  of 
(11.11),  is  tantamount  to  (11.6)  and  therefore  to  (11.3).  ■ 

It  is  easily  verified  that  G  ,  when  it  exists,  is  the  orthogonal  pro¬ 
jector  on  the  eigenspace  of  G  associated  with  the  eigenvalue  1  ,  that  is 
the  space  of  N-vectors  whose  components  are  successive  equally  spaced  ordi¬ 
nates  of  polynomials  of  degree  s  -  1  or  less. 

There  is  a  curious  unsolved  mathematical  problem  concerning  the  stability 
theorem.  It  will  be  recalled  that  the  symmetry  of  G  was  not  included  in  the 
hypothesis  of  Theorem  5.1.  Rather  this  was  a  consequence  of  the  general  hypo¬ 
theses  and  the  five  defining  properties.  However,  in  Theorem  11.1  the  sym¬ 
metry  of  G  is  hypothesized.  While  symmetry  of  the  main  part  of  G  follows 
from  the  symmetry  of  the  coefficients  in  the  main  formula  and  properties  (1) 
to  (4),  the  special  corner  submatrices  are  not  symnetric  unless  A(x)  is 
chosen  so  that  B(x)  -  A(x).  When  the  characteristic  function  of  the  given 


MWA  satisfies  (11.3),  we  might  wish  to  replace  property  (5)  by  the  1  quj.r*. - 
ment  that  the  family  G^  be  stable,  and  still  hope  to  have  G  uniquely 
determined.  At  present  this  appears  to  require  the  additional  hypothesis  thr*- 
G  be  symmetric,  because  the  proof  of  stability  (Greville  1980)  involves  ex¬ 
tensive  use  of  the  well  known  relation  between  Rayleigh  quotients  and  eigen¬ 
values  that  holds  only  for  Hermitian  (including  symmetric)  matrices.  If 
q(z)  has  a  number  of  zeros  (none,  we  assume,  on  the  unit  circle),  there  are, 
in  general,  seme  possible  extensions  with  unsymmetric  corners.  I  conjecture 
that  there  is,  in  such  a  case,  no  unsymmetric  stable  extension,  but  I  have  not 
been  able  to  prove  this;  nor  have  I  been  able  to  find  a  counter-example  to  the 
conjecture.  Thus  the  possibility  exists  (though  I  think  it  unlikely)  that 
some  symmetrical  MWA  (with  qQ  >  0  and  no  zeros  on  the  unit  circle)  might 
have  more  than  one  stable  extension,  the  unique  symmetric  one  and  an  unsym¬ 
metric  one  as  well. 

It  may  be  mentioned,  however,  that  there  are  cogent  reasons  for  thinking 
that  G  should  be  symmetric.  A  square  matrix  is  called  persymmetric  if  it  is 
symmetric  about  its  secondary  diagonal.  It  is  called  centrosynmetric  if  it  is 
symmetric  about  the  center  of  the  matrix:  thus  C  ■  (c^ )  is  centrosynmetric 
if  c„  =  cN_j+1  N-i+1  for  Now,  it  is  easily  seen  that  of  the 

three  properties  of  symmetry,  persymmetry,  and  centrosymmetry ,  any  two  imply 
the  third.  G  is  necessarily  persymmetric,  because  G  -  I  -  F,  where  F  is 
a  Trench  matrix,  and  every  Trench  matrix  is  persymmetric  (see  Greville  and 
Trench  1979).  Therefore,  if  G  is  not  symmetric,  it  is  not  centrosynmetric. 
Now,  if  G  is  not  centrosynmetric,  this  means  that  reversing  the  order  of  the 
observed  values  would  not  merely  reverse  the  order  of  the  smoothed  values,  but 
would  cause  different  numerical  values  to  be  obtained.  For  example,  the  ele¬ 
ments  of  the  bottom  row  of  G  would  not  be  those  of  the  top  row  in  reverse 
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order.  The  formula  for  smoothing  the  last  observation  would  not  be  the  mirror 
image  of  the  one  for  smoothing  the  first  observation,  but  would  be  a  different 
formula.  This  would  seem  to  be  an  undesirable  characteristic  of  the  smoothing 
process . 

12.  SMOOTHING  FORMULAS  IN  THE  STRICT  SENSE  AND  AN  OPTIMAL  PROPERTY 
Under  certain  conditions  the  smoothing  procedure  described  herein  can  be 
shown  to  minimize  a  certain  "loss  function"  analogous  to  the  Whittaker  crite¬ 
rion.  In  a  slightly  more  general  form  of  the  Whittaker  smoothing  method 
(Greville  1957)  one  minimizes  the  sum  of  the  squares  of  the  departures  of  the 
smoothed  values  from  the  observed  values  plus  a  specified  quadratic  form  in 
the  sth  differences  of  the  smoothed  values.  In  matrix  terms,  one  minimizes 

(u  -  y)T  (u  -  y)  +  (Ku)T  HKu  ,  (12.1) 

where  H  is  a  given  positive  definite  matrix  of  order  N  -  s.  Minimization 
of  this  expression  leads  to  the  equation 

(I  +  JCT  HK)u  -  y  , 

T 

which  has  a  unique  solution  for  u  since  I  +  K  HK  is  positive  definite. 

I  showed  (Greville  1957)  that  this  graduation  method  has  the  interesting  prop¬ 
erty  that  if  roughness  (opposite  of  smoothness)  is  measured  by  the  term 
T 

(Ku)  HKu,  smoothness  is  always  increased  by  the  graduation.  By  Theorem  5.22 
of  Noble  (1969) , 

T  -1  T  -1  T  -1 

(I  +  K  HK)  -  I  -  K  (H  +  KK  )  K. 

The  last  expression  is  of  the  form  (5.2)  and  suggests  that  the  use  of  an  MWA 
with  the  natural  extension  might  be  regarded  as  a  generalized  Whittaker 
smoothing  process  if 

-1  T  -1 
D  «  (H  +  KK  )  . 

Solving  for  H  gives 

H  -  (d"1  -  KKV1  .  (12.2) 
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We  are  led  to  inquire,  therefore,  under  what  conditions  an  MWA  is  such 
that  the  right  member  of  (12.2)  for  the  natural  extension  is  positive  definite 


for  all  N 


We  note  in  passing  that 


KK 


is  a  Toeplitz  matrix. 

Schoenberg  (1946,  p.  53)  remarks  that  it  is  desirable  for  an  efficient 
smoothing  formula,  one  that  achieves  adequate  smoothness  without  producing 
unnecessarily  large  departures  from  the  observed  values,  to  have  its  charac¬ 
teristic  function  satisfy  the  stronger  condition 


0  <  4(t)  <  1 


(12.3) 


for  all  t  .  This  remark  seems  to  have  been  little  noted  in  the  years  since 
its  publication.  We  shall  call  an  MWA  a  smoothing  formula  in  the  strict 
sense  if  its  characteristic  function  satisfies  (12.3). 

Lemma  12.1.  Under  the  natural  extension  of  a  given  MWA,  D  -  KK  is 
nonsingular  if  and  only  if  G  is  nonsingular,  and  H  defined  by  (12.2)  is 
positive  definite  if  and  only  if  G  is  positive  definite. 

Proof.  If 


G  -  I  -  KT  DK  ,  (12.4) 

as  in  (5.2),  then  by  Noble's  theorem 

—1  t  — 1  t  -1 

G  -  I  +  K  (D  -  KK  )  K  ,  (12.5) 


provided  G  and  D  are  nonsingular.  Under  the  natural  extension,  D  is 

always  nonsingular  by  property  (4).  In  the  procf  of  Noble's  theorem,  the 

-1  T 

nonsingularity  of  D  -  KK  iB  shown  to  follow  from  that  of  G  and  D  . 

-1  T 

On  the  other  hand,  if  D  -  KK  is  nonsingular,  multiplication  of  the  right 
members  of  (12.4)  and  (12.5)  gives  the  identity.  This  proves  the  first 


statement  of  the  lemma 


Now  let  H  be  positive  definite.  We  have  shown  that 

-1  T 

G  ■  I  +  K  HK  . 

Then,  if  v  is  an  arbitrary  nonzero  real  vector, 

vT  G*1  v  =  vT  v  +  (Kv)T  HKv.  (12.6) 

The  second  term  of  the  right  member  of  (12.6)  is  nonnegative,  since  H  is 
positive  definite,  and  the  first  term  is  positive.  It  follows  that  G  1  , 
and  therefore  G  ,  is  positive  definite. 

Conversely,  let  G  be  positive  definite.  Applying  Noble's  theorem  to 
(12.2)  gives 

H  =  D  +  DK(I  -  KT  DK)”1  KT  D  =  D  +  DKG_1  RT  D. 

Now,  we  note  that  under  the  natural  extension  D  is  positive  definite 

(Greville  1980,  Theorem  1),  since  all  the  zeros  of  A(x)  are  outside  the  unit 

T 

circle.  Thus ,  the  same  argument  used  previously  shows  that  v  Hv  >  0  for 
every  nonzero  real  vector  v  ,  and  so  H  is  positive  definite.  ■ 

Theorem  12.2.  Under  the  natural  extension  of  a  given  MW  A,  H  is  posi¬ 
tive  definite  for  all  N  if  and  only  if  $(t)  satisfies  (12.3). 

Proof.  By  Lemma  12.1,  H  is  positive  definite  if  and  only  if  G  is 
positive  definite;  therefore  we  need  consider  only  the  positive  definiteness 
of  G  .  We  recall  that  G  *  I  -  F  ,  where  F  is  a  singular,  symmetric  Trench 
matrix  characterized  by  two  identical  polynomials  equal  to  A(x).  since  all 
the  zeros  of  A(x) ,  with  the  exception  of  +1  ,  are  outside  the  unit  circle, 

F  is  positive  samidefinite  (Greville  1980,  Theorem  1),  and  if 

f(x)  -  A(x)  A(l/x) , 

then 

*(t)  -  fie11)  -  |A(eit:)|2 

is  nonnegative  for  all  t  .  Let  M  denote  the  maximum  of  ip(t) . 
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Let  <Mt)  satisfy  (12.3).  Since  <j»  (t)  -  1  -  (Ht) ,  it  follows  uhat 

0  <_  <(>( t)  <_  1  (j-2  ~) 

for  all  t  .  Therefore  M  <_  1  ,  and  it  follows  (Greville  1980,  Theorem  2) 

that  for  all  N  all  eigenvalues  of  F  are  nonnegative  and  less  than  unity. 

Since  the  eigenvalues  of  G  are  1  minus  those  of  F  ,  all  of  the  former  are 

positive  for  all  N  ,  and  therefore  G  is  positive  definite  for  all  N  . 

Conversely,  let  G  be  positive  definite  for  all  N  .  Then  all  its  eige 

values  are  positive  for  all  N,  and  consequently  those  of  F  are  less  them 

unity  (but  not  less  than  zero,  since  F  is  positive  semidefinite) .  Since  M 

is  the  limit  of  the  largest  eigenvalue  as  N  approaches  infinity  (Greville 

1980,  Theorem  2),  M  <_  1.  Therefore  (12.7)  holds,  and  it  is  equivalent  to 

(12.3).  ■ 

It  is  easy  to  construct  an  MWA  that  is  a  smoothing  formula  in  the  strict 

sense.  However,  none  of  the  weighted  averages  in  common  use  fall  in  this 

class.  As  a  practical  matter,  the  smoothing  effected  by  such  formulas  is 

likely  to  be  too  "gentle."  In  particular,  using  the  properties  of  Jacobi 

polynomials,  I  have  shown  elsewhere  (Greville  1966)  that  the  characteristic 

functions  of  all  minimum-R  averages  assume  negative  values  in  (0,  2n) . 

s 

Thus  no  such  formula  is  a  smoothing  formula  in  the  strict  sense. 

There  is,  however,  one  family  of  moving  averages,  mentioned  in  the  liter¬ 
ature  but  not  in  general  use,  that  are  smoothing  formulas  in  the  strict  sense. 

This  is  the  limiting  case  of  the  minimum-R  formulas  as  s  approaches  infin- 

s 

ity  (Greville  1966) .  In  finite-difference  form,  the  minim\a-Rn  MWA  of 


2m  +  1  terms,  exact  for  the  degree  2s  -  1  ,  is 

sjl  <23  y,  . 

j-0  3 

where  the  operator  y  is  defined  by 


yf(x)  -  i[f(x  +  i)  +  f (x  -  *)]  , 

2  2 

so  that  w  ■  1  +  £6  .  The  characteristic  function  is 
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$(t)  -  (cos  it)2*m  s+1^  £  (m  ®+3)  sin23  it, 

j-0  3 

which  is  nonneyative  in  0  <  t  <  2ir,  with  a  single  zero  of  multiplicity 
2(m  -  s  +  1)  at  t-ir. 

It  may  be  mentioned  that,  in  the  case  where  $(t)  assumes  some  negative 
values  (and  G  and  H  are  nonsingular),  though  the  expression  (12.1)  does 
not  have  an  extremum,  the  natural  extension  of  the  graduation  does  corre¬ 
spond  to  a  saddle  point  of  (12.1).  It  is  not  clear  what  significance  this 
observation  may  have. 
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1*  Monthly  Precipitation  (inches) ,  Kadis on,  Wisconsin,  1967-71 


Tear  and  Month 

Observed 

Value 

Value 

Tear  and  Month 

Observed 

Value 

Graduated 

Value 

1967  January 

1.63 

1.11 

1969 

July 

4.28 

3.81 

February 

1.17 

1.63 

August 

0.96 

3.17 

March 

1.49 

2.24 

September 

1.35 

2.33 

April 

2.57 

2.88 

October 

2.65 

1.56 

Hay 

3.53 

3.42 

November 

0.70 

1.06 

June 

6.46 

3.74 

December 

1.66 

0.82 

July 

2.51 

3.85 

1970 

January 

0,44 

0.90 

August 

2.71 

3.75 

February 

0.16 

1.25 

September 

2,68 

3.42 

March 

1.17 

1.78 

October 

5.52 

2.92 

April 

2.53 

2.39 

November 

1.83 

2.31 

May 

6.09 

2.94 

December 

1.89 

1.69 

June 

2.26 

3.37 

1968  January 

0.56 

1.31 

July 

2.42 

3.63 

February 

0.49 

1.36 

August 

0.97 

3.69 

March 

0.59 

1.87 

September 

8.82 

3.50 

April 

4.18 

2.69 

October 

2.65 

3.20 

May 

2,02 

3.49 

November 

1.06 

2.74 

Ame 

7.82 

3.91 

December 

2.12 

2.28 

July 

2.54 

3.92 

1971 

January 

1.48 

1.94 

August 

2.58 

3.54 

February 

2.59 

1.76 

September 

4.45 

2.97 

March 

1.52 

1.74 

October 

0.85 

2.45 

April 

2.42 

1.81 

November 

1.74 

1.99 

May 

0.98 

1.93 

December 

2.89 

1.64 

June 

2.27 

2.02 

19&9  January 

2.26 

1.56 

July 

1.65 

2.13 

February 

0.18 

1.81 

August 

3.96 

2.24 

March 

1.4? 

2.35 

September 

1.87 

2.40 

April 

2.72 

3.13 

October 

1.30 

2.63 

"*y 

3.45 

3.81 

November 

3.48 

2.84 

June 

7.96 

4,05 

December 

3.64 

3.28 

SOUBCKi  Obserrsd  valve*  fro*  0,  S.  Department  of  Commerce,  National 
Oceanic  and  Atneepherlc  Administration,  Envixonaental  Data  Service, 
local  Climatological  Data.  Annual  Summary  with  Comparative  Data. 
Hadis on.  Wleccnaln.  1972.  National  Cllnatlc  Center,  Asheville,  N.  C., 
1973. 
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2.  Korlng-Amtgi  Coefficients*^)  ud  Extension 
Coefficients  (»j)  of  Minixus-ft^  ("Henderson's  Ideal") 
Avenges  of  5  to  23  Terns  Exact  for  Cublcs 


Kusber  of  Terns 

rj- 

? 

9  H 

13 

1 

aJ 

e5  aJ 

*J  eJ  *J 

CJ  ‘j 

0  .559440 

.412588 

.331140  .277944 

.240058 

1  ,293706 

2 

.293706  1.618034 

.266557  1.352613  .238693  1.160811 

.214337  1.016301 

2  *.073426 

-1 

.058741  -.236068 

.118470  ,114696  .141268  .281079 

.147356  .360880 

3 

-.058741  -.381966 

-.009873  -.287231  .035723  -.140968 

.065492  -.021625 

4 

-.040724  -.180078  -.026792  -.204545 

0  -.160909 

5 

-.027864  -.096377 

-.027864  -.138330 

6 

-.019350  -.056317 

*Osleole.tsd  by  fersuls  (9*5)* 
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a .  * 

2*  Moving-Average  Coefficient*  (Cj)  end  Extension 
Coefficients  (»j)  of  Dininun-B^  ("Henderson's  Ideal”) 
Averages  of  5  to  23  Terns  Exact  for  Cubic*  (continued) 


Dumber  of  Terns 

15  1?  19  21  23 


J  CJ  ll  °3  It _ li¬ 

ft  .211542  .189232  .171266 

1  .193742  .903661  .176390  .813444  .161691 

2  .145904  .397295  .141112  .410885  .134965 

3  .082918  .064751  .092293  .124932  .096658 

4  .024028  -.100710  .042093  -.043456  .054685 

5  -.014134  -.135445  .002467  -.110644  .017474 

6  -.024499  -.094424  -.018640  -.106213  -.008155 

7  -.013730  -.035128  -.020370  -.065896  -.018972 

8  -.009961  -.023Q52  -.016601 

9  -.007378 
10 

11 


-  -*3 _ . li  . 

.156470  .144060 

.739580  .149136  .678000  .138318  .625880 
.412090  .128423  .406495  .121949  .397207 
.166162  .097956  .193174  .097395  .212501 
.005097  .063038  .046016  .068303  .075236 
-.078255  .029628  -.046290  .038933  -.015313 
-.099972  .003119  -.084020  .013430  -.063927 
-.081843  -.012896  -.084711  -.004948  -.078737 
-.047103  -.017614  -.063066  -.014527  -.070064 
-.015756  -.013455  -.034444  -.015687  -.048977 
-.005570  -.011134  -.010918  -.025714 
-.004278  -.008092 


a Calculated  by  formula  (9.5). 
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****  ■  V** 


••  ■ ft* •  *♦ ’ raw*"  f. 


3*  Moving-Average  Coefficients  (c j)  and  Extension 
Coefficients  (*j)  of  Seleeted  Moving  Averages 


Macaulay* 

Spencer 

15-Tereb 

Veelkousee 

Hard/ 

Klghae*  Karupf 

320c,  a, 

125o,  a, 

1J5.,  U5., 

0 

162 

74 

23 

24 

25 

125 

1 

171 

,919760 

67 

.961572 

24 

.885108 

22 

.739988 

24 

.859550 

114 

.820240 

2 

127 

.393023 

46 

.372752 

21 

.421982 

17 

.386211 

18 

.399283 

87 

.402924 

3 

72 

.055273 

21 

.015904 

7 

.028721 

10 

.124325 

10 

.087040 

53 

.114622 

4 

17 

-.113111 

3 

-.123488 

3 

-.076050 

4 

-.023648 

3 

-.072738 

21 

-.047133 

5 

1 

H 

-.140462 

-5 

-.125229 

0 

-.107285 

0 

-.080067 

0 

-.104527 

0 

-.102491 

6 

-19 

-.084512 

-6 

-.075887 

-2 

-.092723 

-2 

-.079459 

-2 

-.093953 

—8 

-.091791 

7 

-10 

-.029971 

-3 

-.025624 

-3 

-.0(59753 

-2 

-.049327 

-2 

-.055312 

-9 

-.060239 

8 

-1 

-.018OQ3 

-1 

-.019343 

-6 

-.028636 

9 

-2 

-.007496 

*Heoaulay  1931.  p.  53*  footnote  2. 

'Woaulay  1931,  p.  55l  Henderson  1938,  p.  53. 

°Henderson  1938,  p.  53. 

Slenders  on  1938,  ?.  53t  Benjamin  and  Haycocks  1970,  p.  238. 
*Henderson  1938,  p,  53. 

^Henderson  1938,  p.  53. 
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3,  Kovlng'ivmg*  Coefficient*  (cj)  and  Extension 
Coefficients  <Sj)  of  Selected  Moving  Avenges  (continued) 


Spencer.  Hardy  ,  Vaughan  .  . 

Andrews8  21-Tern  Wave-Cutting1  Foraula  AJ  Kenehlngton* 


5  10060c 

j  _  V 

350o 

J 

‘*1  *j.  . 

1440c 

A  Vi 

385c 

.1  *1 

0 

1688 

60 

5 

182 

45 

1 

1379 

.7007b? 

5f 

.729724 

5  .480996 

179 

.593236 

44 

.527740 

t 

1323 

.406808 

47 

.408707 

6  .368708 

170 

.396409 

41 

.370688 

3 

950 

.1797*9 

33 

.167281 

7  .267940 

149 

.230238 

36 

.236445 

A 

331 

.027155 

18 

.009255 

7  .166506 

U5 

.096761 

30 

.128638 

5 

225 

-.05*586 

6 

-.069703 

6  .072964 

72 

-.000857 

22 

.043118 

6 

-.083701 

-2 

-.091513 

4  -.008222 

29 

-.060076 

13 

-.018390 

7 

-12b 

-.078256 

-5 

-.076165 

1  -.075434 

•5 

-.083321 

5 

-.053902 

8 

-135 

-.05*368 

-5 

-.049051 

-1  -.097387 

•26 

-.079596 

-1 

-.067080 

9 

-no 

-.031120 

-3 

-.022502 

-2  -.009039 

-29 

-.056662 

-5 

-.064844 

10 

-61 

-.012428 

-1 

-.006033 

• 

1 

7 

-19 

1 

• 

8 

8 

-6 

-.050323 

11 

-i  -.024996 

-6 

-.007395 

-5 

-.032035 

It 

-3 

-.015626 

13 

-1 

-.004429 

^Andrews  and  Kenbltt  1965,  p,  18. 

*H»ot*iay  1931.  p.  51 1  Henderson  1938,  p.  53* 
Htajuda  and  Xayoooks  1970,  p.  239. 

J«m*m  1933.  P.  4J7. 

Sanderson  1938,  p,  33. 
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Figure  a.  Observed  and  Graduated  Values  of  Monthly 
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